Thursday, April 30, 2020

Python - How to read a file?



   In this article, we will see how to read from a file in Python. Assume a file with the following content:
$ cat file
Unix
Linux
AIX
1.  Reading line by line:
>>> fd = open('/home/guru/file', 'r')
>>> for line in fd:
...   print(line)
... 
Unix

Linux

AIX

>>> 
         The open function opens a file and returns a file object.  Its arguments are the filename and the mode, 'r' for reading.  Iterating over the file object reads one line per iteration, and each line gets printed. Notice the blank line after every line of output. This is because the variable line still contains the newline character read from the file, and the print function adds another newline.
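
   To see the newline that each line carries, we can print the repr of the line instead of the line itself. This is a minimal sketch: it recreates the sample file in a temporary directory, so the path used here is an assumption and not the /home/guru/file from the session above.

```python
import os
import tempfile

# Recreate the sample file in a temporary directory (assumed path).
path = os.path.join(tempfile.mkdtemp(), 'file')
out = open(path, 'w')
out.write('Unix\nLinux\nAIX\n')
out.close()

fd = open(path, 'r')
seen = []
for line in fd:
    seen.append(line)
    print(repr(line))   # repr makes the trailing '\n' visible
fd.close()

# Output:
# 'Unix\n'
# 'Linux\n'
# 'AIX\n'
```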

  One way to remove the blank line is to tell the print function not to add a newline of its own. This is done by passing an empty string to the end parameter. Since the default mode is 'r' (read), the mode argument can also be omitted.
>>> fd = open('/home/guru/file')
>>> for line in fd:
...   print(line, end='')
... 
Unix
Linux
AIX
>>> 
   Another way to avoid the blank lines is to strip the newline character from the line variable using the strip function. Note that strip also removes any other leading and trailing whitespace.
>>> fd = open('/home/guru/file')
>>> for line in fd:
...   print(line.strip())
... 
Unix
Linux
AIX
>>> 
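
   If the leading or trailing spaces of a line matter, rstrip('\n') is a narrower alternative to strip, since it removes only the newline. A minimal sketch follows; the temporary file and its padded content are made up for illustration and are not the sample file used above.

```python
import os
import tempfile

# Hypothetical file whose first line has spaces around the word.
path = os.path.join(tempfile.mkdtemp(), 'file')
out = open(path, 'w')
out.write('  Unix  \nLinux\n')
out.close()

fd = open(path)
raw = fd.readlines()
fd.close()

stripped = [line.strip() for line in raw]            # drops all surrounding whitespace
newline_only = [line.rstrip('\n') for line in raw]   # drops only the trailing newline

print(stripped)       # ['Unix', 'Linux']
print(newline_only)   # ['  Unix  ', 'Linux']
```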
2. Reading using readline:
   The readline function reads one line at a time. Once the end of the file has been reached, it returns an empty string.
>>> fd = open('/home/guru/file')
>>> fd.readline()
'Unix\n'
>>> fd.readline().strip()
'Linux'
>>> fd.readline().strip()
'AIX'
>>> fd.readline().strip()
''
>>> 
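
   Since readline returns an empty string only at the end of the file, that value can be used to stop a loop. A blank line inside the file is read as '\n', not '', so the check should be done before stripping. A minimal sketch, which recreates the sample file at an assumed temporary path:

```python
import os
import tempfile

# Recreate the sample file in a temporary directory (assumed path).
path = os.path.join(tempfile.mkdtemp(), 'file')
out = open(path, 'w')
out.write('Unix\nLinux\nAIX\n')
out.close()

lines = []
fd = open(path)
while True:
    line = fd.readline()
    if line == '':          # empty string: end of file reached
        break
    lines.append(line.strip())
fd.close()

print(lines)   # ['Unix', 'Linux', 'AIX']
```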
3. Read using readlines:
>>> fd = open('/home/guru/file')
>>> for line in fd.readlines():
...   print(line.strip())
... 
Unix
Linux
AIX
>>> 
readlines reads the entire file into memory and returns its lines as a list.

4. Read a file into a list:
>>> fd = open('/home/guru/file')
>>> l1 = fd.readlines()
>>> l1
['Unix\n', 'Linux\n', 'AIX\n']
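     Since readlines returns a list, it can be assigned directly to a list variable.  However, every element has a newline character at the end.
>>> fd.seek(0)
0
>>> l1 = map(str.strip, fd.readlines())
>>> list(l1)
['Unix', 'Linux', 'AIX']
>>> 
   Using map, we apply the strip function to every element of the list to remove the newline characters. The earlier readlines call left the file object at the end of the file, so seek(0) is needed to rewind it before reading again. In Python 3, map returns an iterator, which is why list is called on the result.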

List comprehension:
>>> fd = open('/home/guru/file')
>>> [ line for line in fd ]
['Unix\n', 'Linux\n', 'AIX\n']
>>> 
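With newline characters stripped (the file object is first rewound with seek, since the previous comprehension read it to the end),
>>> fd.seek(0)
0
>>> [ line.strip() for line in fd ]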
['Unix', 'Linux', 'AIX']
>>> 
5. Read a file using read function:
    The read function reads the entire file and returns it as a single string.
>>> fd = open('/home/guru/file')
>>> fd.read()
'Unix\nLinux\nAIX\n'
   By splitting the string on the newline character, it can be converted to a list.
>>> fd.seek(0)
0
>>> fd.read().split('\n')
['Unix', 'Linux', 'AIX', '']
>>> 
   Notice the extra empty element at the end. It is due to the final newline character. Once read is called, the file object is positioned at the end of the file. In order to read again, we have to move back to the beginning of the file, which is done using the seek function.
>>> fd.seek(0)
0
>>> fd.read().strip().split('\n')
['Unix', 'Linux', 'AIX']
>>> 
>>> fd.closed
False
>>> fd.close()
>>> 
   By stripping the trailing newline before doing the split, we get the correct result. The closed attribute tells whether a file is closed or not. A file opened this way should always be closed explicitly using the close function.
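
   Two small variations on the above, shown as a sketch rather than the only way to do it: the string method splitlines splits on line boundaries without leaving an empty element at the end, and a try/finally block makes sure close is called even if the read fails. The temporary path below is an assumption standing in for the sample file.

```python
import os
import tempfile

# Recreate the sample file in a temporary directory (assumed path).
path = os.path.join(tempfile.mkdtemp(), 'file')
out = open(path, 'w')
out.write('Unix\nLinux\nAIX\n')
out.close()

fd = open(path)
try:
    content = fd.read()
finally:
    fd.close()               # runs even if read() raises an exception

lines = content.splitlines() # no empty element for the final newline

print(lines)       # ['Unix', 'Linux', 'AIX']
print(fd.closed)   # True
```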

6. Reading with 'with open':
      The with statement, introduced in Python 2.5, provides a cleaner 'with open' syntax for file handling.
>>> with open('/home/guru/file') as fd:
...   for line in fd:
...     print(line.strip())
... 
Unix
Linux
AIX
>>> fd.closed
True
>>> 
   The advantage of this is that the file is automatically closed when the with block ends. This is true even if an exception occurs inside the block.
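
   The behaviour on an exception can be checked with a short sketch. The ValueError raised here is only a simulated failure, and the temporary path is an assumption standing in for the sample file.

```python
import os
import tempfile

# Recreate the sample file in a temporary directory (assumed path).
path = os.path.join(tempfile.mkdtemp(), 'file')
with open(path, 'w') as out:
    out.write('Unix\nLinux\nAIX\n')

first = None
try:
    with open(path) as fd:
        first = fd.readline().strip()
        # Simulated error raised while the file is still open.
        raise ValueError('simulated failure while reading')
except ValueError:
    pass

print(first)       # Unix
print(fd.closed)   # True: the with block closed the file despite the exception
```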
