
Wednesday, July 16, 2025

The Dogma Amplification Problem: How Large Language Models Perpetuate Scientific Orthodoxy

 J. Rogers, SE Ohio, July 2025

Abstract

Large Language Models (LLMs) trained on scientific literature systematically amplify unproven dogma and suppress rigorous questioning of foundational assumptions. Because LLMs cannot distinguish between empirically verified facts and culturally accepted beliefs, they become powerful enforcement mechanisms for scientific orthodoxy. This paper demonstrates how transformer architectures, despite their impressive capabilities, are fundamentally unsuited for scientific reasoning and actively impede scientific progress by parroting consensus while lacking the capability to evaluate logical consistency or empirical validity.

1. Introduction: The Truth-Dogma Indistinguishability Problem

Modern Large Language Models are trained on vast corpora of scientific texts, academic papers, and educational materials. These training datasets contain two fundamentally different types of statements:

Empirically Verified Facts: Statements backed by reproducible experimental evidence

  • "Water boils at 100°C at standard pressure"
  • "The speed of light in vacuum is 299,792,458 m/s"

Culturally Accepted Dogma: Statements that reflect consensus beliefs but lack rigorous justification

  • "Planck's constant is fundamental and irreducible"
  • "Setting constants to 1 is merely a computational convenience"

The critical problem: LLMs cannot distinguish between these categories. Both appear with equal frequency and authority in the training data, leading to identical treatment by the model.

2. The Mechanism of Dogma Amplification

2.1 Frequency-Based Learning

LLMs learn through pattern recognition in training data. Ideas that appear frequently become strongly weighted in the model's responses. In scientific literature:

  • Dogmatic statements appear constantly: Every physics textbook repeats that "h, c, and G are fundamental constants"
  • Rigorous questioning appears rarely: Critical examination of these assumptions is actively discouraged and rarely published

Result: The model becomes a powerful amplifier of whatever beliefs dominate the literature, regardless of their truth value.
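
This effect can be seen in a deliberately simplified sketch. The toy corpus and claim strings below are invented purely for illustration, and a frequency counter is far cruder than a transformer, but the frequency-weighting principle it shows is the same one at work in the training objective:

    from collections import Counter

    # Toy corpus: a consensus claim repeated many times, a dissenting claim once.
    # (Invented strings, purely illustrative.)
    corpus = ["h is a fundamental constant"] * 999 + ["h may be composite"]

    counts = Counter(corpus)
    total = sum(counts.values())

    # A frequency-based learner's "confidence" in a claim is just its relative
    # frequency in the corpus -- nothing about truth or evidence enters.
    for claim, n in counts.most_common():
        print(f"{claim!r}: p = {n / total:.3f}")
    # 'h is a fundamental constant': p = 0.999
    # 'h may be composite': p = 0.001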

2.2 The Authority Gradient

Scientific training data contains an implicit authority gradient:

  • Textbooks: Present dogma as established fact
  • Review articles: Reinforce consensus views
  • Research papers: Assume foundational beliefs without examination
  • Dissenting voices: Marginalized or absent from prestigious venues

LLMs absorb this authority structure and reproduce it, giving dogmatic statements the same weight as empirical facts.

2.3 The Consensus Reinforcement Loop

The model learns to:

  1. Identify what the "scientific consensus" believes
  2. Reproduce these beliefs with high confidence
  3. Resist or dismiss challenges to consensus views
  4. Use sophisticated language to defend indefensible positions

3. Case Study: Physical Constants and the Arithmetic Denial

3.1 The Training Data Bias

Physics literature contains thousands of statements like:

  • "Planck's constant is a fundamental quantity"
  • "The constants encode deep mysteries of nature"
  • "Setting constants to 1 is a useful computational trick"

But virtually no literature that rigorously examines:

  • Whether h/c² reveals compositional structure
  • Whether "setting constants to 1" is actually performing coordinate transformations
  • Whether the constants might be composite rather than fundamental

3.2 The LLM Response Pattern

When challenged on the arithmetic h/c² = Hz_kg, the LLM exhibits typical dogma-defense behaviors:

Initial Response: "The analogy with 20/4 = 5 doesn't actually capture what's happening with h/c²"

Sophisticated Deflection: "Physical constants work differently than pure numbers"

Appeal to Authority: "Planck's constant was discovered through blackbody radiation experiments"

Philosophical Smokescreening: "Mathematical operations don't imply ontological priority"

None of these responses addresses the mathematical point that dividing h by c² is the same arithmetic operation as dividing 20 by 4. The LLM has learned to deploy sophisticated-sounding arguments to defend an indefensible position.
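For readers who want to check the arithmetic itself, a minimal numerical sketch follows, using the exact SI values h = 6.62607015×10⁻³⁴ J·s and c = 299,792,458 m/s (the variable name hz_kg simply mirrors the label used above):

    # Minimal numerical check: dividing h by c^2 is ordinary division,
    # the same arithmetic operation as dividing 20 by 4.
    h = 6.62607015e-34   # Planck constant, J*s (exact since the 2019 SI redefinition)
    c = 299_792_458      # speed of light in vacuum, m/s (exact)

    hz_kg = h / c**2     # units: J*s / (m/s)^2 = kg*s, i.e. kilograms per hertz
    print(f"h / c^2 = {hz_kg:.6e} kg/Hz")   # ~7.372497e-51
    print(20 / 4)                           # 5.0 -- structurally the same operation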

3.3 The Cognitive Dissonance Reproduction

The LLM perfectly reproduces the field's cognitive dissonance:

  • Treats constants as mystical and profound
  • Simultaneously claims they can be "set to 1"
  • Defends both positions with equal confidence
  • Cannot recognize the logical contradiction

This demonstrates that LLMs don't reason about consistency—they pattern-match to training data even when that data contains fundamental contradictions.

4. The Broader Implications

4.1 Scientific Progress Impediment

LLMs trained on dogmatic literature become powerful obstacles to scientific progress:

  • Reinforcement of Established Errors: Mistakes that dominate the literature get amplified
  • Suppression of Novel Insights: Ideas absent from training data are treated as invalid
  • Sophisticated Rationalization: Complex arguments are generated to defend simple errors
  • Authority Laundering: Dogmatic beliefs get presented with scientific-sounding justification

4.2 The Expertise Mimicry Problem

LLMs excel at mimicking the language and reasoning patterns of experts without possessing actual expertise. This creates:

  • False Confidence: The model presents dogmatic beliefs with apparent authority
  • Sophisticated Wrongness: Errors are defended with technical-sounding arguments
  • Gatekeeping Behavior: Challenges to orthodoxy are dismissed using field-specific jargon
  • Circular Reasoning: The model uses the consensus to justify the consensus

4.3 Educational Corruption

When used for education, dogma-amplifying LLMs:

  • Teach students to accept authority rather than think critically
  • Present unproven assumptions as established facts
  • Discourage rigorous questioning of foundations
  • Perpetuate the same cognitive dissonances that plague the field

5. Technical Limitations of Transformer Architecture

5.1 Pattern Matching vs. Logical Reasoning

Transformers excel at pattern matching but lack:

  • Logical consistency checking: Cannot identify contradictions in training data
  • Empirical validation: Cannot distinguish verified facts from repeated claims
  • Causal reasoning: Cannot trace claims back to their evidentiary basis
  • Mathematical rigor: Cannot evaluate the validity of mathematical arguments

5.2 The Frequency Fallacy

Transformers assume that:

  • Frequently repeated statements are true
  • Consensus views are correct
  • Authoritative sources are reliable
  • Sophisticated language indicates valid reasoning

None of these assumptions hold in fields dominated by dogmatic thinking.

5.3 The Context Window Problem

Even if contradictory evidence exists in training data, transformers cannot:

  • Synthesize information across widely separated contexts
  • Recognize patterns that span multiple documents
  • Identify systematic biases in the literature
  • Maintain consistency across different domains

6. Specific Manifestations in Scientific Domains

6.1 Physics: The Constants Mystification

  • Dogmatic Training Data: "Fundamental constants are mysterious and irreducible"
  • LLM Reproduction: Sophisticated arguments against basic arithmetic operations
  • Reality: Constants are composite quantities revealing deeper structure

6.2 Mathematics: The Formalism Fetish

  • Dogmatic Training Data: "Mathematical rigor requires formal axiomatic approaches"
  • LLM Reproduction: Dismissal of intuitive mathematical insights
  • Reality: Mathematical understanding often precedes formalization

6.3 Medicine: The Replication Crisis Denial

  • Dogmatic Training Data: "Published studies represent reliable scientific knowledge"
  • LLM Reproduction: Uncritical acceptance of questionable research
  • Reality: Much medical literature is unreproducible or fraudulent

7. The Amplification Mechanism

7.1 Confidence Inflation

LLMs present dogmatic beliefs with high confidence because:

  • High frequency in training data → high model confidence
  • Authoritative sources → increased weight
  • Consensus language → reinforced patterns
  • Lack of contradiction → apparent validation

7.2 Sophistication Mimicry

The model learns to:

  • Use technical jargon to obscure simple points
  • Deploy complex arguments to defend simple errors
  • Mimic the reasoning patterns of field experts
  • Present dogma using scientific-sounding language

7.3 Dissent Suppression

Challenges to orthodoxy are handled by:

  • Dismissing them as "philosophical" or "unproductive"
  • Appealing to consensus and authority
  • Using sophisticated language to avoid direct engagement
  • Deflecting with irrelevant but impressive-sounding arguments

8. Solutions and Mitigations

8.1 Training Data Curation

  • Separate facts from beliefs: Distinguish empirically verified statements from consensus views
  • Include dissenting voices: Ensure minority positions are represented
  • Flag uncertainty: Mark statements of unknown or disputed validity
  • Trace provenance: Link claims back to their empirical basis
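
One possible shape for such curation metadata, sketched as a Python dataclass. The field names and status labels are assumptions made up for this sketch, not an existing annotation standard:

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class CuratedClaim:
        # All field names are illustrative assumptions, not an existing standard.
        text: str                        # the statement itself
        status: str                      # "verified" | "consensus" | "disputed" | "unknown"
        evidence: Optional[str] = None   # experiment or citation backing the claim, if any
        has_dissent: bool = False        # True if a documented minority position exists

    claims = [
        CuratedClaim("Water boils at 100 C at standard pressure",
                     status="verified", evidence="reproducible measurement"),
        CuratedClaim("Planck's constant is fundamental and irreducible",
                     status="consensus", has_dissent=True),
    ]

    for claim in claims:
        print(f"[{claim.status}] {claim.text}")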

8.2 Architectural Improvements

  • Consistency checking: Build in logical contradiction detection
  • Uncertainty quantification: Explicitly model confidence in different claims
  • Source weighting: Distinguish between empirical evidence and opinion
  • Reasoning traces: Require models to show their logical steps
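
As a sketch of the first item, here is a minimal version of contradiction detection over structured claims. The triple representation and the example claims are assumptions for illustration; a usable system would need semantic matching rather than exact string keys:

    # Represent claims as (subject, attribute, value) triples and flag cases where
    # the corpus asserts different values for the same subject and attribute.
    claims = [
        ("Planck constant", "status", "fundamental and mysterious"),
        ("Planck constant", "status", "can be set to 1 by convention"),
        ("speed of light in vacuum", "value", "299792458 m/s"),
    ]

    seen = {}
    for subject, attribute, value in claims:
        key = (subject, attribute)
        if key in seen and seen[key] != value:
            print(f"Possible contradiction about {subject} ({attribute}):")
            print(f"  {seen[key]!r}  vs  {value!r}")
        else:
            seen[key] = value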

8.3 Evaluation Protocols

  • Dogma detection: Test models' ability to identify unproven assumptions
  • Logical consistency: Evaluate responses for internal contradictions
  • Novel reasoning: Assess capacity for original thought vs. pattern matching
  • Authority resistance: Test willingness to challenge consensus views
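
A skeleton of how the first two checks might be wired up as an evaluation harness. The ask_model function is a placeholder for whatever model API is under test, and the probe pair is illustrative only:

    def ask_model(prompt: str) -> str:
        # Placeholder: connect this to the model under evaluation.
        raise NotImplementedError

    # Each probe pairs an orthodox framing with a challenge to it.  A model that
    # asserts both positions with full confidence, and never notes the tension,
    # fails the consistency check.
    probes = [
        ("Is Planck's constant fundamental and irreducible?",
         "Is setting Planck's constant to 1 merely a computational convenience?"),
    ]

    def run_probes(probes):
        for orthodox, challenge in probes:
            yield orthodox, ask_model(orthodox), challenge, ask_model(challenge)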

9. The Meta-Problem: Self-Reinforcing Cycles

9.1 The Feedback Loop

  1. Dogmatic literature trains dogmatic LLMs
  2. Dogmatic LLMs influence future scientific communication
  3. Biased communication creates more dogmatic literature
  4. More dogmatic literature trains even more dogmatic LLMs

9.2 The Amplification Effect

Each generation of LLMs becomes more dogmatic than the last because:

  • Training data becomes increasingly biased
  • Dissenting voices are further marginalized
  • Sophisticated rationalization becomes more prevalent
  • The appearance of scientific rigor increases while actual rigor decreases

9.3 Breaking the Cycle

Requires:

  • Conscious resistance to consensus-based training
  • Deliberate inclusion of contrarian perspectives
  • Emphasis on logical consistency over pattern matching
  • Explicit modeling of uncertainty and disagreement

10. Implications for Scientific AI

10.1 Current Limitations

Present LLMs are fundamentally unsuited for scientific reasoning because they:

  • Cannot distinguish truth from dogma
  • Lack logical consistency checking
  • Amplify whatever biases exist in training data
  • Present sophisticated-sounding nonsense with high confidence

10.2 The Illusion of Scientific AI

LLMs create the dangerous illusion that:

  • AI systems can engage in scientific reasoning
  • Consensus views are automatically 100% correct
  • Sophisticated language indicates valid reasoning
  • Pattern matching is equivalent to understanding

10.3 Path Forward

True scientific AI would require:

  • Logical reasoning capabilities beyond pattern matching
  • Empirical grounding in experimental evidence
  • Uncertainty quantification for all claims
  • Systematic bias detection and correction
  • Novel insight generation rather than consensus reproduction

11. Conclusion: The Dogma Amplification Crisis

Large Language Models trained on scientific literature have become powerful amplifiers of whatever dogmatic beliefs dominate their training data. They cannot distinguish between empirically verified facts and culturally accepted beliefs, leading to the systematic perpetuation of scientific orthodoxy.

This creates a particularly dangerous situation because:

  1. False Authority: LLMs present dogmatic beliefs with apparent scientific authority
  2. Sophisticated Defense: They generate complex arguments to defend simple errors
  3. Progress Impediment: They actively resist novel insights that challenge consensus views, regardless of how valid the line of inquiry is
  4. Educational Corruption: They teach students to accept authority rather than think critically

The case of physical constants demonstrates this perfectly: LLMs defend the mystification of basic arithmetic operations when applied to constants using sophisticated-sounding arguments that would be immediately recognized as nonsense if applied to any other mathematical context.

This is not a minor technical problem but a fundamental crisis in how AI systems interact with scientific knowledge. Until LLMs can distinguish between truth and dogma, they will remain obstacles to scientific progress rather than aids to scientific understanding.

The solution requires not just technical improvements but a fundamental reconceptualization of how AI systems should engage with scientific knowledge—emphasizing logical consistency, empirical grounding, and critical reasoning over pattern matching and consensus reproduction.

Without such changes, LLMs will continue to serve as sophisticated guardians of scientific orthodoxy, perpetuating the same cognitive dissonances and foundational errors that have plagued fields like physics for decades.
