Read a text file into a string and strip newlines in Python

David Y.

The Problem

I have a text file (dna.txt) containing multiple lines, for example:

ATCAGTGGAAACCCAGTGCTA
GAGGATGGAATGACCTTAAAT
CAGGGACGATATTAAACGGAA

Using Python, how do I read it into a string variable as one long line, i.e. removing newlines? I want the final string to look like this:

ATCAGTGGAAACCCAGTGCTAGAGGATGGAATGACCTTAAATCAGGGACGATATTAAACGGAA

The Solution

We can achieve this using the following Python code:

with open("dna.txt", "r") as file:
    dna = file.read().replace("\n", "")

print(dna)

# will print ATCAGTGGAAACCCAGTGCTAGAGGATGGAATGACCTTAAATCAGGGACGATATTAAACGGAA

In the above code:

  • open("dna.txt", "r") opens the file in read mode (r). We use Python’s with statement to automatically close the file at the end of the block.
  • file.read() reads the entire contents of the file into a string.
  • replace("\n", "") is a string method that returns a copy of the string with every newline character replaced by an empty string, i.e. removed.
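To try the steps above without an existing dna.txt, here is a minimal, self-contained sketch. The helper name read_without_newlines and the temporary directory are assumptions made for illustration only; the reading logic itself is the same open/read/replace chain described above.

```python
import os
import tempfile


def read_without_newlines(path):
    # Hypothetical helper: read the whole file, then drop every newline character.
    with open(path, "r") as file:
        return file.read().replace("\n", "")


# Write a throwaway copy of the sample file so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "dna.txt")
    with open(sample_path, "w") as sample:
        sample.write(
            "ATCAGTGGAAACCCAGTGCTA\n"
            "GAGGATGGAATGACCTTAAAT\n"
            "CAGGGACGATATTAAACGGAA\n"
        )
    dna = read_without_newlines(sample_path)

print(dna)
```

Wrapping the logic in a function keeps the file handling in one place if several files need the same treatment.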

In some cases, we may prefer to replace newlines with other characters, such as a single space. We can do this with a slight modification to the above code:

with open("dna.txt", "r") as file:
    dna = file.read().replace("\n", " ") # replace newline with space

print(dna)

# will print ATCAGTGGAAACCCAGTGCTA GAGGATGGAATGACCTTAAAT CAGGGACGATATTAAACGGAA

An alternative but less explicit way to produce the same output is to use str.splitlines and str.join. splitlines() creates a list containing each line in the file without its line ending, and join() converts that list back into a single string with a specified delimiter between the lines. We can use an empty string as the delimiter to remove the newlines entirely:

with open("dna.txt", "r") as file:
    dna = "".join(file.read().splitlines())

print(dna)

# will print ATCAGTGGAAACCCAGTGCTAGAGGATGGAATGACCTTAAATCAGGGACGATATTAAACGGAA

We could also call join() on any other string, which is then placed between the lines as a separator:

with open("dna.txt", "r") as file:
    dna = " ".join(file.read().splitlines())  # separate lines with a single space

print(dna)

# will print ATCAGTGGAAACCCAGTGCTA GAGGATGGAATGACCTTAAAT CAGGGACGATATTAAACGGAA

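With an empty separator, both approaches produce the same output for a file like this one. With a space separator they differ if the file ends with a newline: replace() turns that final newline into a trailing space, while splitlines() followed by join() does not. The second approach may also be confusing to readers unfamiliar with Python.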
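One caveat worth checking is how each approach treats a newline at the very end of the input. The sketch below uses a short in-memory string instead of dna.txt (the shortened sequences are made up for illustration) and compares the two approaches side by side.

```python
# Stand-in for file.read(); note the newline after the last line.
text = "ATCAG\nGAGGA\nCAGGG\n"

# Approach 1: every newline, including the final one, becomes a space.
with_replace = text.replace("\n", " ")

# Approach 2: splitlines() drops the line endings, so join() only
# places spaces between the lines.
with_join = " ".join(text.splitlines())

print(repr(with_replace))  # 'ATCAG GAGGA CAGGG '
print(repr(with_join))     # 'ATCAG GAGGA CAGGG'

# With an empty separator the two approaches agree on this input.
print(text.replace("\n", "") == "".join(text.splitlines()))  # True
```

If the trailing space from replace() is unwanted, calling .strip() on the result is one way to remove it.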
