DNA String Analysis: Computable Functions

Aidan Dyga, Molly Suppo, Anoop Guragain

October 27, 2025

Introduction to DNA String Analysis

  • DNA sequences consist of four genetic symbols: \(\{A, C, T, G\}\)
  • We analyze these sequences by counting and comparing symbol occurrences
  • These operations are computable: they always terminate and produce correct results

Key insight: String analysis problems with finite alphabets and deterministic counting are decidable.

Problem Statement

  • Given variables \(N\) and \(M\) in the set \(\{A, C, T, G\}\) where \(M \neq N\)
  • Goal: Implement functions to analyze DNA strings:
    • Count occurrences of specific symbols
    • Compare frequencies between different symbols

Building countCs: Counting a Specific Symbol

  • The count_cs function searches for “C” and is a specific case.
  • count_cs is computable and tractable.

Building countNs: Counting Any Symbol

  • The count_ns function searches for the value of n and is a general case.
  • count_ns is computable and tractable.

Building more_cs_than_gs: Symbol Comparison

  • The more_cs_than_gs function compares the count of “C” and “G” symbols.
  • Uses count_cs and count_ns for clarity and reuse.
  • Returns True if there are more Cs than Gs, else False.

Building has_more_ns_than_ms: General Comparison

  • The has_more_ns_than_ms function generalizes symbol comparison.
  • Compares the count of any two symbols, n and m.
  • Returns True if there are more n symbols than m symbols.

Why These Functions Are Computable

  • Each function performs Finite Operations (counting and comparison).
  • Python’s str.count() method scans the string once, it always terminates.
  • There are no infinite loops.
  • Therefore, these functions are Decidable and Computable.

Computational Complexity Analysis

  • Each count() call scans the string once which is O(n).
  • Comparisons are O(1) operations.
  • Overall complexity:
    • count_cs: O(n)
    • count_ns: O(n)
    • more_cs_than_gs: O(n)
    • has_more_ns_than_ms: O(n)

Conclusion

  • Computability: DNA string analysis problems are decidable and tractable
  • Implementation: Simple counting and comparison operations suffice
  • Theory: Demonstrates clear distinction between computable and uncomputable problems

Take-home message: Not all comquaputational problems are equally difficult; understanding computability helps us recognize solvable problems.

Sources

  • Course Textbook
  • Python str.count() documentation
  • GitHub Copilot
  • Microsoft Copilot