Read and Write Files

  • We have several sequences stored in files and want to do something with them
  • Example: Calculate distance between sequences and write distances to file

 

sequence1.txt

GATCGTTCG

 

sequence2.txt

CATGGTTGA

 

  • A program that calculates the distance between two sequences

readSeq.py

import sys
import Dist

inputfile1 = open(sys.argv[1], 'r')
inputfile2 = open(sys.argv[2], 'r')

# readline() keeps the trailing newline, so strip it before comparing
A = inputfile1.readline().strip()
B = inputfile2.readline().strip()

inputfile1.close()
inputfile2.close()

print("Distance between A and B is " + str(Dist.d(A, B)))
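One detail worth checking when reading sequences with readline(): the returned string still contains the line's trailing newline, which would otherwise count as an extra character in a distance calculation. A minimal sketch, assuming a temporary file that mimics sequence1.txt (the temporary directory is only for illustration):

```python
import os
import tempfile

# Create a one-line sequence file like sequence1.txt in a temporary directory.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sequence1.txt")
    with open(path, "w") as f:
        f.write("GATCGTTCG\n")

    with open(path, "r") as f:
        raw = f.readline()   # still ends with the newline character

clean = raw.strip()          # whitespace removed from both ends

print(repr(raw))
print(repr(clean))
print(len(raw), len(clean))
```

Printing with repr() makes the otherwise invisible newline visible.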

 

 

Execution of the Program

  • To execute the program, we call the interpreter from the command line with the program's filename as the first argument, the first sequence file as the second argument and the second sequence file as the third argument
$ python3 readSeq.py sequence1.txt sequence2.txt
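Inside the program, the command-line arguments arrive in the list sys.argv: index 0 holds the script name, and the file names follow. The sketch below shows one possible way to check the argument count before opening any file. The helper name parse_args and the usage message are hypothetical, and a simulated list is used in place of the real sys.argv so the example runs on its own:

```python
def parse_args(argv):
    """Return the two input file names from a list shaped like sys.argv.

    Hypothetical helper for illustration; in a real script you would
    import sys and call parse_args(sys.argv).
    """
    if len(argv) != 3:
        raise SystemExit("usage: python3 readSeq.py <file1> <file2>")
    return argv[1], argv[2]

# Simulated argument list for:
#   python3 readSeq.py sequence1.txt sequence2.txt
example_argv = ["readSeq.py", "sequence1.txt", "sequence2.txt"]

file1, file2 = parse_args(example_argv)
print("script name:", example_argv[0])
print("first file: ", file1)
print("second file:", file2)
```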

 

  • A program that calculates the distance between two sequences and writes the distance to a file

 

import sys
import Dist

inputfile1 = open(sys.argv[1], 'r')
inputfile2 = open(sys.argv[2], 'r')

# readline() keeps the trailing newline, so strip it before comparing
A = inputfile1.readline().strip()
B = inputfile2.readline().strip()

inputfile1.close()
inputfile2.close()

# the name of the output file is given as third command-line argument
outputfile = open(sys.argv[3], 'w')
outputfile.write("Distance between A and B is " + str(Dist.d(A, B)) + "\n")
outputfile.close()
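Unlike print(), write() does not append a newline by itself, so consecutive writes end up on the same line unless "\n" is added explicitly. A small sketch, assuming a temporary output file (the file name distances.txt is made up for this example):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "distances.txt")

    outputfile = open(path, "w")
    outputfile.write("first")       # no newline: the next write continues this line
    outputfile.write("second\n")
    outputfile.write("third\n")
    outputfile.close()

    # Read the file back to see how the lines were actually stored.
    inputfile = open(path, "r")
    lines = inputfile.readlines()
    inputfile.close()

print(lines)
```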

 

  • Now the first file contains several sequences, one per line
  • Example: Calculate the distance from each of them to the sequence in the second file and write the distances to a file

sequence1b.txt

GATCGTTCG
TCGTT
ATCGTAA
GTGGTTGA
AGTCGT

sequence2.txt

CATGGTTGA

 

import sys
import Dist


inputfile1 = open(sys.argv[1], 'r')
inputfile2 = open(sys.argv[2], 'r')

# readlines() returns a list with one string per line
Asequences = inputfile1.readlines()
B = inputfile2.readline().strip()

inputfile1.close()
inputfile2.close()

outputfile = open(sys.argv[3], 'w')

for A in Asequences:
    A = A.strip()   # remove the trailing newline
    outputfile.write("Distance between " + A + " and " + B + " is " + str(Dist.d(A, B)) + "\n")

outputfile.close()

 

import sys
import Dist


inputfile1 = open(sys.argv[1], 'r')
inputfile2 = open(sys.argv[2], 'r')

# both files may now contain several sequences, one per line
Asequences = inputfile1.readlines()
Bsequences = inputfile2.readlines()

inputfile1.close()
inputfile2.close()

outputfile = open(sys.argv[3], 'w')

# compare every sequence of the first file with every sequence of the second
for A in Asequences:
    A = A.strip()
    for B in Bsequences:
        B = B.strip()
        outputfile.write("Distance between " + A + " and " + B + " is " + str(Dist.d(A, B)) + "\n")

outputfile.close()
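To try the nested loop without any files, the sketch below runs it on lists shaped like the result of readlines(). The Dist module is not shown in this section, so the function d here is only a hypothetical stand-in (mismatches over the shared positions plus the length difference); the real Dist.d may compute something else.

```python
def d(a, b):
    """Hypothetical stand-in for Dist.d, NOT the course's implementation:
    number of mismatching positions plus the difference in length."""
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches + abs(len(a) - len(b))

# Lists as readlines() would return them: each entry ends with a newline.
Asequences = ["GATCGTTCG\n", "TCGTT\n"]
Bsequences = ["CATGGTTGA\n"]

report = []
for A in Asequences:
    A = A.strip()
    for B in Bsequences:
        B = B.strip()
        report.append("Distance between " + A + " and " + B + " is " + str(d(A, B)))

for line in report:
    print(line)
```

With two sequences in the first list and one in the second, the loop body runs 2 x 1 = 2 times; in general it runs once per pair.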

 

  • There are many other ways to read files
    One of the most convenient, which also closes the file automatically:
with open("<your_file>") as f:
    for line in f:
        <do something with line>
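A runnable version of this pattern might look as follows. It is a sketch that first creates a small temporary file resembling sequence1b.txt and then collects the length of each sequence; the variable names are chosen for this example only.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sequence1b.txt")

    # Prepare a small input file with one sequence per line.
    with open(path, "w") as f:
        f.write("GATCGTTCG\nTCGTT\nATCGTAA\n")

    # Iterate over the file line by line; no explicit close() is needed.
    lengths = []
    with open(path) as f:
        for line in f:
            lengths.append(len(line.strip()))

    file_closed = f.closed   # the with block has already closed the file

print(lengths)
print(file_closed)
```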

 

You can list all files in a directory, e.g.

import glob

print(glob.glob("*"))
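The pattern passed to glob.glob can also select only some files, for example all text files. A minimal sketch, assuming a temporary directory with three made-up files so that the result does not depend on your current folder:

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # Create three empty example files.
    for name in ["sequence1.txt", "sequence2.txt", "readSeq.py"]:
        with open(os.path.join(tmpdir, name), "w") as f:
            f.write("")

    # "*.txt" matches only names ending in .txt; glob returns the names
    # in arbitrary order, so they are sorted here.
    txt_files = sorted(glob.glob(os.path.join(tmpdir, "*.txt")))
    names = [os.path.basename(p) for p in txt_files]

print(names)
```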

 

  • Many other functionalities
  • You can run other programs from within Python
import os

os.system(<run_program>)
  • You can start Python programs from within bash scripts
  • And many other things...
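As an alternative to os.system, the standard library's subprocess module can run another program and also capture what it prints. The sketch below uses subprocess.run rather than os.system, and as its "other program" it simply starts a second Python interpreter, which is an assumption made so the example works on any machine with Python 3.7 or newer:

```python
import subprocess
import sys

# Run a second Python interpreter as the external program.
# sys.executable is the path of the interpreter running this script.
result = subprocess.run(
    [sys.executable, "-c", "print('hello from a child process')"],
    capture_output=True,   # collect stdout and stderr instead of printing them
    text=True,             # return strings rather than bytes
    check=True,            # raise an error if the program fails
)

output = result.stdout.strip()
print(output)
print("exit code:", result.returncode)
```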

 

References

A full book about Python, freely available for download:

    "How to Think Like a Computer Scientist", with examples in Python

 

More information: