# Work with files in Python

You previously explored how to open files in Python as well as how to read them and write to them. You also examined how to adjust the structure of file contents through the <var>.split()</var> method. In this reading, you'll review the <var>.split()</var> method, and you'll also learn an additional method that can help you work with file contents.

## Parsing

Part of working with files involves structuring their contents to meet your needs. **Parsing** is the process of converting data into a more readable format. Data may need to become more readable in two ways. First, certain parts of your Python code may require data in a specific format. By converting data into this format, you enable Python to process it in a specific way. Second, programmers need to read and interpret the results of their code, and parsing can also make the data more readable for them.

Methods that can help you parse your data include <var>.split()</var> and <var>.join()</var>.

## .split()

### **The basics of .split()**

The <var>.split()</var> method converts a string into a list. It separates the string based on a specified character that's passed into <var>.split()</var> as an argument.

In the following example, the usernames in the <var>approved\_users</var> string are separated by a comma. For this reason, a string containing the comma (<var>","</var>) is passed into <var>.split()</var> in order to parse it into a list. Run this code and analyze the different contents of <var>approved\_users</var> before and after the <var>.split()</var> method is applied to it:

```python
approved_users = "elarson,bmoreno,tshah,sgilmore,eraab"
print("before .split():", approved_users)
approved_users = approved_users.split(",")
print("after .split():", approved_users)
```

```
before .split(): elarson,bmoreno,tshah,sgilmore,eraab
after .split(): ['elarson', 'bmoreno', 'tshah', 'sgilmore', 'eraab']
```

Before the <var>.split()</var> method is applied, <var>approved\_users</var> contains a string. After the result of <var>.split()</var> is assigned back to <var>approved\_users</var>, the variable contains a list.

If you do not pass an argument into <var>.split()</var>, it separates the string wherever it encounters whitespace, and it treats consecutive whitespace characters as a single separator.

**Note:** Python considers a variety of characters to be whitespace. These characters include spaces, tabs, newline characters, and others.
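
As a minimal sketch of this behavior, the following code compares <var>.split()</var> with no argument against <var>.split(" ")</var>. The <var>raw\_users</var> string and the other variable names are made up for illustration:

```python
# Hypothetical string that mixes double spaces, a tab, and newlines
raw_users = "wjaffrey  jsoto\tabernard\njhill\n"

# No argument: any run of whitespace counts as one separator
no_argument = raw_users.split()
print(no_argument)

# Explicit " " argument: splits at every single space and nothing else
space_argument = raw_users.split(" ")
print(space_argument)
```

```
['wjaffrey', 'jsoto', 'abernard', 'jhill']
['wjaffrey', '', 'jsoto\tabernard\njhill\n']
```

With no argument, the tab and the newlines are treated as separators and no empty strings appear in the list. With <var>" "</var> as the argument, the double space produces an empty string, and the tab and newlines stay inside the last element.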

The following example demonstrates how a string of usernames that are separated by spaces can be split into a list through the <var>.split()</var> method:

```python
removed_users = "wjaffrey jsoto abernard jhill awilliam"
print("before .split():", removed_users)
removed_users = removed_users.split()
print("after .split():", removed_users)
```

```
before .split(): wjaffrey jsoto abernard jhill awilliam
after .split(): ['wjaffrey', 'jsoto', 'abernard', 'jhill', 'awilliam']
```

Because an argument isn't passed into <var>.split()</var>, Python splits the <var>removed\_users</var> string at each space when separating it into a list.

### **Applying .split() to files**

The <var>.split()</var> method allows you to work with file content as a list after you've read it into a string through the <var>.read()</var> method. This is useful in a variety of ways. For example, if you want to iterate through the file contents in a <var>for</var> loop, you can easily do so once the contents are converted into a list.

The following code opens the <var>"update\_log.txt"</var> file. It then reads all of the file contents into the <var>updates</var> variable as a string and splits the string in the <var>updates</var> variable into a list by creating a new element at each whitespace:

```python
with open("update_log.txt", "r") as file:
    updates = file.read()
updates = updates.split()
```

After this, through the <var>updates</var> variable, you can work with the contents of the <var>"update\_log.txt"</var> file in parts of your code that require it to be structured as a list.
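
One way this might look in practice is iterating through the list in a <var>for</var> loop. The sketch below is only an illustration: it writes a throwaway <var>"update\_log.txt"</var> file to a temporary directory so that it can run on its own, and the file contents, the <var>path</var> variable, and the <var>patched\_count</var> variable are all assumptions rather than part of the original example:

```python
import os
import tempfile

# Create a throwaway file so this sketch is self-contained (contents are made up)
path = os.path.join(tempfile.mkdtemp(), "update_log.txt")
with open(path, "w") as file:
    file.write("patched rebooted patched\nfailed patched")

# Read the file into a string, then split the string into a list
with open(path, "r") as file:
    updates = file.read()
updates = updates.split()

# Iterate through the list and count one hypothetical status value
patched_count = 0
for update in updates:
    if update == "patched":
        patched_count += 1
print(patched_count)
```

```
3
```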

**Note:** Because the line that contains <var>.split()</var> is not indented as part of the <var>with</var> statement, the file closes before the string is split. Closing a file as soon as it is no longer needed releases it promptly and keeps the code easy to follow. Once the file is read into the <var>updates</var> variable, it is no longer needed and can be closed.

## .join()

### **The basics of .join()**

If you need to convert a list into a string, there is also a method for that. The <var>.join()</var> method concatenates the elements of an iterable into a string. The syntax used with <var>.join()</var> is distinct from the syntax used with <var>.split()</var> and other methods that you've worked with, such as <var>.index()</var>.

In methods like <var>.split()</var> or <var>.index()</var>, you append the method to the string or list that you're working with and then pass in other arguments. For example, the code <var>usernames.index(2)</var> appends the <var>.index()</var> method to the variable <var>usernames</var>, which contains a list. It passes in <var>2</var> as the argument to indicate which value to search for, and the method returns the index of the first element that matches that value.
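
As a quick illustration of this syntax, the following sketch appends <var>.index()</var> to a list and passes in the value to look up. The list contents and the <var>position</var> variable are hypothetical:

```python
# Hypothetical list of usernames
usernames = ["elarson", "bmoreno", "tshah"]

# .index() is appended to the list; the value to find is passed in as the argument
position = usernames.index("bmoreno")
print(position)
```

```
1
```

The method returns the position of the value in the list, counting from <var>0</var>.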

However, with <var>.join()</var>, you pass in the list that you want to concatenate as the argument. You append <var>.join()</var> to a string containing the character that you want to place between the elements once they are joined.

For example, in the following code, the <var>approved\_users</var> variable contains a list. If you want to join that list into a string and separate each element with a comma, you can use <var>",".join(approved\_users)</var>. Run the code and examine what it returns:

```python
approved_users = ["elarson", "bmoreno", "tshah", "sgilmore", "eraab"]
print("before .join():", approved_users)
approved_users = ",".join(approved_users)
print("after .join():", approved_users)
```

```
before .join(): ['elarson', 'bmoreno', 'tshah', 'sgilmore', 'eraab']
after .join(): elarson,bmoreno,tshah,sgilmore,eraab
```

Before <var>.join()</var> is applied, <var>approved\_users</var> is a list of five elements. After it is applied, it is a string with each username separated by a comma.

**Note:** Another way to separate elements when using the <var>.join()</var> method is to use <var>"\\n"</var>, which is the newline character. Joining with <var>"\\n"</var> places each element on its own line.
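
A minimal sketch of joining with the newline character, using a shortened version of the list from the previous example:

```python
approved_users = ["elarson", "bmoreno", "tshah"]
# Join the elements with a newline between each pair
approved_users = "\n".join(approved_users)
print(approved_users)
```

```
elarson
bmoreno
tshah
```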

### **Applying .join() to files**

When working with files, it may also be necessary to convert their contents back into a string. For example, you may want to use the <var>.write()</var> method. The <var>.write()</var> method writes string data to a file. This means that if you have converted a file's contents into a list while working with them, you'll need to convert the list back into a string before using <var>.write()</var>. You can use the <var>.join()</var> method for this.

You already examined how <var>.split()</var> could be applied to the contents of the <var>"update\_log.txt"</var> file once it is converted into a string through <var>.read()</var> and stored as <var>updates</var>:

```python
with open("update_log.txt", "r") as file:
    updates = file.read()
updates = updates.split()
```

After you've finished performing operations on the list in the <var>updates</var> variable, you might want to replace the contents of <var>"update\_log.txt"</var> with the new contents. To do so, you need to first convert <var>updates</var> back into a string using <var>.join()</var>. Then, you can open the file using a <var>with</var> statement and use the <var>.write()</var> method to write the <var>updates</var> string to the file:

```python
updates = " ".join(updates)
with open("update_log.txt", "w") as file:
    file.write(updates)
```

The code <var>" ".join(updates)</var> places a space between each of the list elements in <var>updates</var> when it joins them back into a string. And because <var>"w"</var> is specified as the second argument of <var>open()</var>, Python will overwrite the contents of <var>"update\_log.txt"</var> with the string currently in the <var>updates</var> variable.
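
Putting the pieces together, the following is a hedged end-to-end sketch of the read, split, modify, join, and write cycle. To stay self-contained, it works on a throwaway copy of <var>"update\_log.txt"</var> in a temporary directory. The file contents, the <var>path</var> and <var>result</var> variables, and the use of <var>.remove()</var> as a stand-in for "performing operations" are all assumptions for illustration:

```python
import os
import tempfile

# Create a throwaway file so this sketch is self-contained (contents are made up)
path = os.path.join(tempfile.mkdtemp(), "update_log.txt")
with open(path, "w") as file:
    file.write("elarson bmoreno\ntshah")

# Read the contents into a string and split the string into a list
with open(path, "r") as file:
    updates = file.read()
updates = updates.split()

# Stand-in for whatever list operations you need: drop one entry
updates.remove("bmoreno")

# Join the list back into a string and overwrite the file
updates = " ".join(updates)
with open(path, "w") as file:
    file.write(updates)

# Read the file again to confirm what was written
with open(path, "r") as file:
    result = file.read()
print(result)
```

```
elarson tshah
```

Because <var>.split()</var> with no argument discards the original whitespace, the line break in the original file is replaced by a space after the round trip. If the line structure matters, joining with <var>"\\n"</var> may be a better fit.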

## Key takeaways

An important element of working with files is being able to parse the data they contain. Parsing means converting the data into a more readable format. The <var>.split()</var> and <var>.join()</var> methods are both useful for parsing data. The <var>.split()</var> method allows you to convert a string into a list, and the <var>.join()</var> method allows you to convert a list into a string.