YAML is a human-readable data serialization language. It lets us serialize objects or data structures so they can be stored, or sent over the network for other APIs or computers to consume. It is mostly used for configuration files and as a replacement for JSON files. YAML is a superset of JSON.

That means a YAML file can contain JSON, and JSON can be converted to YAML. YAML syntax is independent of any specific programming language.
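To make the superset claim concrete, here is a minimal sketch that parses the same JSON text with both the standard json module and pyyaml. The sample data is made up for illustration, and the comparison is only meant to hold for typical JSON like this, not for every edge case.

```python
import json
import yaml

# A small JSON document (hypothetical sample data).
json_text = '{"name": "my_first_app", "ports": [2003, 7890], "debug": false, "owner": null}'

from_json = json.loads(json_text)      # parsed as JSON
from_yaml = yaml.safe_load(json_text)  # the same text parsed as YAML

print(from_yaml)
print(from_json == from_yaml)  # True
```

Both parsers produce the same Python dictionary, which is why a JSON file can usually be read with a YAML parser as-is.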

There are a few things to keep in mind while working with YAML files.

How is data stored in YAML files?

Data is represented in two ways in YAML files.

As a sequence, in which each item starts with a dash followed by a space. This is analogous to a Python list.

				
					- Apple
- Orange
- Grapes
- Mango
				
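As a quick check of the list analogy, the sketch below loads the same sequence from an inline string (instead of a file, to keep it self-contained) and shows that it becomes a Python list.

```python
import yaml

# The fruit sequence from above, held in a string for this example.
sequence_text = """
- Apple
- Orange
- Grapes
- Mango
"""

fruits = yaml.safe_load(sequence_text)
print(fruits)        # ['Apple', 'Orange', 'Grapes', 'Mango']
print(type(fruits))  # <class 'list'>
```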

As a map of key-value pairs, which is analogous to a Python dictionary. Values can span multiple lines using | (literal style, which keeps the line breaks) or > (folded style, which joins the lines with spaces).

				
					# key-value pairs, including a nested mapping under Address
Name: Danish Xavier
City: Ontario
Address: 
    street: |
        123 Tornado Alley
        Suite 16
    State: MS
				
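The difference between | and > is easiest to see side by side. In this small sketch the keys literal and folded are names chosen only for the example; the address lines are the ones used above.

```python
import yaml

# Same two lines of text, once in literal style (|) and once in folded style (>).
text = """
literal: |
    123 Tornado Alley
    Suite 16
folded: >
    123 Tornado Alley
    Suite 16
"""

data = yaml.safe_load(text)
print(repr(data["literal"]))  # '123 Tornado Alley\nSuite 16\n'
print(repr(data["folded"]))   # '123 Tornado Alley Suite 16\n'
```

The literal style keeps the line break between the two lines, while the folded style replaces it with a single space.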
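
A single YAML file can also hold multiple documents. Each document starts with --- and can optionally end with ... as in the example below.

				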
					---
# Document 1
Name: Danish Xavier
City: Ontario
Address: 
    street: |
        123 Tornado Alley
        Suite 16
    State: MS
...
---
# Document 2
Name: Jon Doe
City: Toronto
Address:
    street: |
        344 Briarwood Road
        Springfield, Boulevard
    State: KS
...
        
				

Reading YAML Files into Python

We will use the pyyaml library to work with YAML files. We can install it using the following command.

				
					pip install pyyaml
				

You can find the documentation for the library on the PyYAML website.

Create a YAML file

We will create a YAML file and parse it using the pyyaml library.

We will name the file configuration.yaml.

Let's add some configuration details to it.

				
					---
Application:
    app_name: my_first_app
    version: 0.1.3
app_server_ip: http://0.0.0.0:2003/
db_uri: mongodb://sysop:moon@localhost
active_ports:
            - 2003
            - 7890
            - 0987
file_path: /home/app/
...
            
				

Reading the YAML file in Python

Let's read the YAML file configuration.yaml into Python.

				
					import yaml
# reading the file from the same directory
with open('configuration.yaml', 'r') as read_:
    data = yaml.load(read_, Loader=yaml.SafeLoader)
print(data)
print(type(data))

				

Output (formatted for readability):

				
					{
    "Application": {"app_name": "my_first_app", "version": "0.1.3"},
    "app_server_ip": "http://0.0.0.0:2003/",
    "db_uri": "mongodb://sysop:moon@localhost",
    "active_ports": [2003, 7890, "0987"],
    "file_path": "/home/app/",
}
<class 'dict'>
				
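One detail worth checking is how pyyaml decides the Python type of each value. The sketch below uses a trimmed copy of the configuration as an inline string. As far as the loader's rules go, 0987 has a leading zero but is not a valid octal number, so it is kept as a string, and 0.1.3 is not a valid float, so it is a string as well.

```python
import yaml

# A trimmed copy of configuration.yaml, inlined for this example.
config_text = """
Application:
    app_name: my_first_app
    version: 0.1.3
active_ports:
    - 2003
    - 0987
"""

config = yaml.safe_load(config_text)

# Nested maps are nested dictionaries.
print(config["Application"]["app_name"])  # my_first_app

# Inspect the type chosen for each port.
for port in config["active_ports"]:
    print(repr(port), type(port).__name__)
# 2003 int
# '0987' str

print(type(config["Application"]["version"]).__name__)  # str
```

If every port should be an integer, writing 987 without the leading zero avoids the surprise.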

Options for loaders

BaseLoader:

Only loads the most basic YAML. All scalars are loaded as strings.

SafeLoader:

Loads a subset of the YAML language safely. This is recommended for loading untrusted input.

FullLoader:

It loads the whole YAML language. Avoids arbitrary code execution.
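The difference between the loaders shows up when the same text is loaded twice. This is a small sketch with made-up keys: BaseLoader leaves every scalar as a string, SafeLoader converts standard YAML types, and SafeLoader refuses Python-specific tags that could otherwise run code.

```python
import yaml

text = """
port: 2003
debug: true
ratio: 0.5
"""

base = yaml.load(text, Loader=yaml.BaseLoader)
safe = yaml.load(text, Loader=yaml.SafeLoader)

print(base)  # {'port': '2003', 'debug': 'true', 'ratio': '0.5'}
print(safe)  # {'port': 2003, 'debug': True, 'ratio': 0.5}

# A Python-specific tag that asks the loader to call a function.
unsafe_text = "!!python/object/apply:os.getcwd []"
try:
    yaml.load(unsafe_text, Loader=yaml.SafeLoader)
    rejected = False
except yaml.YAMLError:
    # SafeLoader has no constructor for this tag, so loading fails.
    rejected = True

print(rejected)  # True
```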

Loading Multiple YAML Documents

As discussed above, we can have multiple documents in a YAML file. We can load them into Python using the load_all function, which parses the file and returns a generator object, which we then convert to a list.

Let's load the sample.yaml file shown below into Python.

				
					---
Application Instance:
    app_name: my_first_app
    version: 0.1.3
app_server_ip: http://0.0.0.0:2003/
...
---
Application Instance:
    app_name: my_second_app
    version: 0.1.1
app_server_ip: http://0.0.0.0:2002/                
...
      
				
				
					import yaml
# reading multiple documents from a single yaml file
with open('sample.yaml', 'r') as read_:
    data = yaml.load_all(read_, Loader=yaml.SafeLoader)
    # load_all() returns a lazy generator, so consume it while the file is still open
    data = list(data)
print(data)
print(type(data))
				

Output (formatted for readability):

				
					[
    {
        "Application Instance": {"app_name": "my_first_app", "version": "0.1.3"},
        "app_server_ip": "http://0.0.0.0:2003/",
    },
    {
        "Application Instance": {"app_name": "my_second_app", "version": "0.1.1"},
        "app_server_ip": "http://0.0.0.0:2002/",
    },
]
<class 'list'>
				
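Converting to a list is not required. Since the result is a generator, the documents can also be processed one at a time. The sketch below assumes a shortened two-document string and uses safe_load_all, the shorthand for load_all with SafeLoader.

```python
import yaml

# Two short documents in one string (shortened for this example).
multi_doc_text = """
---
app_name: my_first_app
---
app_name: my_second_app
"""

names = []
# Each iteration parses and yields the next document.
for document in yaml.safe_load_all(multi_doc_text):
    names.append(document["app_name"])

print(names)  # ['my_first_app', 'my_second_app']
```

When reading from a file this way, the loop has to run inside the with block, because the generator reads from the file lazily.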

Writing to a YAML file from Python

Often, we may need to rewrite a configuration file at runtime or otherwise change a YAML file. We can do this with the yaml.dump function.

In the code below we are going to create an output_1.yaml file, which will contain a single document.

				
					import yaml
my_data = {
    "Application Instance": {"app_name": "my_first_app", "version": "0.1.3"},
    "app_server_ip": "http://0.0.0.0:2003/",
    "open_ports": [9087, 765, 2003],
}
with open('output_1.yaml', 'w') as output_:
    yaml.dump(my_data, output_)

				

Output:

				
					Application Instance:
  app_name: my_first_app
  version: 0.1.3
app_server_ip: http://0.0.0.0:2003/
open_ports:
- 9087
- 765
- 2003

				
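Note that yaml.dump sorts the keys alphabetically by default. If the original key order matters, the sort_keys=False option (available in recent pyyaml versions) keeps it. The sketch below dumps to a string instead of a file by leaving out the stream argument; the dictionary is a made-up example.

```python
import yaml

my_data = {
    "version": "0.1.3",
    "app_name": "my_first_app",
    "open_ports": [9087, 765],
}

# Without a stream argument, dump() returns the YAML text as a string.
default_text = yaml.dump(my_data)                   # keys sorted alphabetically
ordered_text = yaml.dump(my_data, sort_keys=False)  # keys in insertion order

print(default_text)
print(ordered_text)
```

In the first output app_name comes first, while in the second version comes first, matching the order in the dictionary.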

Writing multiple documents to a single YAML file from Python

We will now write two Python objects into a single YAML file, output_2.yaml. We put both Python objects in a list and pass it to the dump_all function.

				
					import yaml
my_data_1 = {
    "Application Instance": {"app_name": "my_first_app", "version": "0.1.3"},
    "app_server_ip": "http://0.0.0.0:2003/",
    "open_ports": [9087, 765, 2003],
}
my_data_2 = {
    "Application Instance": {"app_name": "second_app", "version": "0.1.0"},
    "app_server_ip": "http://0.0.0.0:2001/",
    "open_ports": [80, 8080, 1003],
}
documents = [my_data_1, my_data_2]
with open('output_2.yaml', 'w') as output_:
    yaml.dump_all(documents, output_)
				

Output:

				
					Application Instance:
  app_name: my_first_app
  version: 0.1.3
app_server_ip: http://0.0.0.0:2003/
open_ports:
- 9087
- 765
- 2003
---
Application Instance:
  app_name: second_app
  version: 0.1.0
app_server_ip: http://0.0.0.0:2001/
open_ports:
- 80
- 8080
- 1003

				
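By default the first document is written without a leading --- marker. If every document should start with one, the explicit_start=True option can be passed. The following sketch, with shortened made-up documents, dumps to a string and reads the result back to confirm nothing is lost.

```python
import yaml

documents = [
    {"app_name": "my_first_app"},
    {"app_name": "second_app"},
]

# explicit_start=True writes a --- marker before every document.
text = yaml.dump_all(documents, explicit_start=True)
print(text)

# Reading the text back gives the original list of documents.
round_trip = list(yaml.safe_load_all(text))
print(round_trip == documents)  # True
```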

Convert YAML to JSON

We will now load a YAML file, convert it to JSON, and store it on disk as a JSON file.

				
					import yaml
import json
# reading multiple documents from a single yaml file
with open('sample.yaml', 'r') as read_:
    data = yaml.load_all(read_, Loader=yaml.SafeLoader)
    # load_all() returns a lazy generator, so consume it while the file is still open
    data = list(data)
# write the loaded Python objects out as JSON
with open('sample.json', 'w') as sam_json:
    json.dump(data, sam_json, sort_keys=False)

				

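Output (pretty-printed here for readability; without an indent argument, json.dump writes everything on one line):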

				
					[
   {
      "Application Instance":{
         "app_name":"my_first_app",
         "version":"0.1.3"
      },
      "app_server_ip":"http://0.0.0.0:2003/"
   },
   {
      "Application Instance":{
         "app_name":"my_second_app",
         "version":"0.1.1"
      },
      "app_server_ip":"http://0.0.0.0:2002/"
   }
]
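Two things can help with this conversion. Passing indent to the json functions produces readable output, and passing default=str handles values that YAML understands but JSON does not, such as dates. The sketch below is one possible approach, using a made-up document with a released date and converting in memory instead of writing a file.

```python
import json
import yaml

yaml_text = """
app_name: my_first_app
released: 2024-01-15
"""

data = yaml.safe_load(yaml_text)
# pyyaml turns the unquoted date into a datetime.date object.
print(type(data["released"]))  # <class 'datetime.date'>

# default=str converts values JSON cannot represent (here the date) to strings.
json_text = json.dumps(data, indent=2, default=str)
print(json_text)
```

Without default=str, json.dumps would raise a TypeError for the date value.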