YAML is a human-readable data serialization language. We can use it to serialize objects or data structures, store them, or send them over a network for other APIs or computers to consume. It is mostly used for configuration files and as an alternative to JSON. YAML (as of version 1.2) is a superset of JSON.
That means a YAML file can contain JSON, and JSON can also be converted to YAML. YAML syntax is independent of any specific programming language.
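As a quick illustration of the superset relationship, the sketch below feeds a JSON string to PyYAML's safe_load and compares the result with Python's json module. The sample string is made up for illustration, and since PyYAML implements YAML 1.1, unusual JSON edge cases may behave differently.

```python
import json
import yaml

# Hypothetical JSON text, used only for illustration
json_text = '{"app_name": "my_first_app", "ports": [2003, 7890], "debug": true}'

from_yaml = yaml.safe_load(json_text)  # parsed by the YAML parser
from_json = json.loads(json_text)      # parsed by the JSON parser

# Both parsers should produce the same Python dictionary
print(from_yaml == from_json)
```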
There are a few things to keep in mind while working with YAML files.
- YAML is case sensitive.
- YAML uses spaces for indentation. Tabs are not allowed for indentation in a YAML file.
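Both rules can be checked with a small experiment. This is a minimal sketch using PyYAML; the sample strings are invented for the demonstration.

```python
import yaml

# Keys that differ only in case are treated as distinct keys
doc = yaml.safe_load("Name: Danish\nname: Xavier\n")
print(doc)

# A tab used for indentation is rejected by the parser
tab_indented = "Address:\n\tstreet: 123 Tornado Alley\n"
try:
    yaml.safe_load(tab_indented)
    tab_rejected = False
except yaml.YAMLError as err:
    tab_rejected = True
    print("Tab rejected:", type(err).__name__)
```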
How is data stored in YAML files?
Data is represented in two main ways in YAML files.
As a sequence, where each item starts with a dash followed by a space. This is analogous to a Python list.
- Apple
- Orange
- Grapes
- Mango
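To see the analogy with lists in practice, the following sketch loads the same sequence from a string with PyYAML's safe_load. Loading from a string rather than a file is just a convenience for the example.

```python
import yaml

fruits_yaml = """\
- Apple
- Orange
- Grapes
- Mango
"""

fruits = yaml.safe_load(fruits_yaml)
print(fruits)        # ['Apple', 'Orange', 'Grapes', 'Mango']
print(type(fruits))  # <class 'list'>
```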
As a map, which is a set of key-value pairs. The structure is analogous to a Python dictionary. Values can span multiple lines using | (which keeps the line breaks) or > (which folds them into spaces).
# Key-value pairs, including a nested key-value relationship under Address
Name: Danish Xavier
City: Ontario
Address:
  street: |
    123 Tornado Alley
    Suite 16
  State: MS
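The difference between the two multi-line styles is easiest to see side by side. The sketch below loads the same two lines once with | and once with >. The key names literal and folded are made up for this example.

```python
import yaml

text = """\
literal: |
  123 Tornado Alley
  Suite 16
folded: >
  123 Tornado Alley
  Suite 16
"""

doc = yaml.safe_load(text)
print(repr(doc["literal"]))  # '123 Tornado Alley\nSuite 16\n'
print(repr(doc["folded"]))   # '123 Tornado Alley Suite 16\n'
```

With |, the line break between the two lines is kept. With >, it is folded into a single space.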
- Remember that a YAML document can begin with three hyphens (---) and end with three dots (...). These markers let a single file hold several documents.
---
# Document 1
Name: Danish Xavier
City: Ontario
Address:
  street: |
    123 Tornado Alley
    Suite 16
  State: MS
...
---
# Document 2
Name: Jon Doe
City: Toronto
Address:
  street: |
    344 Briarwood Road
    Springfield, Boulevard
  State: KS
...
Reading YAML Files into Python
We will use the PyYAML library to work with YAML files.
We can install it using the following command.
pip install pyyaml
You can find the documentation for the library here.
Create a YAML file
We will create a YAML file and parse it using the PyYAML library.
We will name the file configuration.yaml.
Let's add some configuration details to it.
---
Application:
  app_name: my_first_app
  version: 0.1.3
app_server_ip: http://0.0.0.0:2003/
db_uri: mongodb://sysop:moon@localhost
active_ports:
  - 2003
  - 7890
  - 0987
file_path: /home/app/
...
Reading the YAML file in Python
Let’s read the YAML file configuration.yaml into Python.
import yaml

# Read the file from the same directory as the script
with open('configuration.yaml', 'r') as read_:
    data = yaml.load(read_, Loader=yaml.SafeLoader)

print(data)
print(type(data))
Output:
{
    "Application": {"app_name": "my_first_app", "version": "0.1.3"},
    "app_server_ip": "http://0.0.0.0:2003/",
    "db_uri": "mongodb://sysop:moon@localhost",
    "active_ports": [2003, 7890, "0987"],
    "file_path": "/home/app/",
}
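Note how unquoted scalars are typed when loaded. PyYAML follows YAML 1.1 resolution rules, so a value with a leading zero is not necessarily a plain decimal integer. The sketch below is a small experiment on a string. The 0777 entry is a hypothetical addition to show the octal case.

```python
import yaml

snippet = """\
version: 0.1.3
active_ports:
  - 2003
  - 0987
  - 0777
"""

doc = yaml.safe_load(snippet)
print(doc)
# 0.1.3 is not a valid number, so it stays a string
# 2003 is a decimal integer
# 0987 has a leading zero but is not valid octal, so it stays a string
# 0777 is read as octal, which is 511 in decimal
```

If a port or version must keep its exact text, quoting it in the YAML file (for example "0987") avoids any surprise.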
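- Remember to pass the Loader parameter. Calling yaml.load without it used to fall back to an unsafe loader, which can execute arbitrary code when parsing untrusted YAML. Recent PyYAML releases (6.0 and later) require the argument.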
Options for loaders
BaseLoader:
Only loads the most basic YAML. All scalars are loaded as strings.
SafeLoader:
Loads a subset of the YAML language safely. This is recommended for loading untrusted input.
FullLoader:
Loads the full YAML language. It is designed to avoid arbitrary code execution, but SafeLoader remains the better choice for untrusted input.
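The practical difference between the loaders can be sketched with a short comparison. The input strings below are invented for illustration. The !!python/tuple tag is a PyYAML-specific tag that SafeLoader is expected to refuse.

```python
import yaml

text = "port: 2003\ndebug: true\n"

# BaseLoader keeps every scalar as a string
base = yaml.load(text, Loader=yaml.BaseLoader)
# SafeLoader resolves standard YAML types such as int and bool
safe = yaml.load(text, Loader=yaml.SafeLoader)
print(base)  # {'port': '2003', 'debug': 'true'}
print(safe)  # {'port': 2003, 'debug': True}

# SafeLoader rejects Python-specific tags instead of constructing objects
tagged = "point: !!python/tuple [1, 2]\n"
try:
    yaml.load(tagged, Loader=yaml.SafeLoader)
    rejected = False
except yaml.YAMLError as err:
    rejected = True
    print("Rejected:", type(err).__name__)
```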
Loading Multiple YAML Documents
As discussed above, we can have multiple documents in a YAML file. We can load them into Python using the load_all method, which parses the file and returns a generator object, which we then convert to a list.
Let's load the sample.yaml file shown below into Python.
---
Application Instance:
  app_name: my_first_app
  version: 0.1.3
app_server_ip: http://0.0.0.0:2003/
...
---
Application Instance:
  app_name: my_second_app
  version: 0.1.1
app_server_ip: http://0.0.0.0:2002/
...
import yaml

# Read multiple documents from a single YAML file
with open('sample.yaml', 'r') as read_:
    data = yaml.load_all(read_, Loader=yaml.SafeLoader)
    # load_all() returns a generator, so consume it while the file is open
    data = list(data)

print(data)
print(type(data))
Output:
[
    {
        "Application Instance": {"app_name": "my_first_app", "version": "0.1.3"},
        "app_server_ip": "http://0.0.0.0:2003/",
    },
    {
        "Application Instance": {"app_name": "my_second_app", "version": "0.1.1"},
        "app_server_ip": "http://0.0.0.0:2002/",
    },
]
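Converting to a list is not required. The generator can also be consumed one document at a time, which may be preferable for large files. The sketch below uses safe_load_all on an in-memory string with made-up content. When reading from a file, the loop should run while the file is still open.

```python
import yaml

multi_doc = """\
---
app_name: my_first_app
...
---
app_name: my_second_app
...
"""

names = []
# Each iteration parses and yields the next document
for document in yaml.safe_load_all(multi_doc):
    names.append(document["app_name"])

print(names)  # ['my_first_app', 'my_second_app']
```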
Python write into YAML file
Often, we may need to rewrite a configuration file at runtime or change a YAML file. We can do this with the yaml.dump method.
In the code below we create an output.yaml file, which will contain a single document.
import yaml

my_data = {
    "Application Instance": {"app_name": "my_first_app", "version": "0.1.3"},
    "app_server_ip": "http://0.0.0.0:2003/",
    "open_ports": [9087, 765, 2003],
}

# Serialize the dictionary and write it to the file
with open('output.yaml', 'w') as output_:
    yaml.dump(my_data, output_)
Output:
Application Instance:
  app_name: my_first_app
  version: 0.1.3
app_server_ip: http://0.0.0.0:2003/
open_ports:
- 9087
- 765
- 2003
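By default, the dump functions write mapping keys in sorted order. If the original key order matters, the sort_keys argument (available in PyYAML 5.1 and later, as far as I know) can be set to False. This is a minimal sketch with a made-up dictionary. It returns the YAML as a string instead of writing a file.

```python
import yaml

settings = {"version": "0.1.3", "app_name": "my_first_app"}

# Default behaviour: keys are written in sorted order
sorted_text = yaml.safe_dump(settings)
# sort_keys=False keeps the dictionary's insertion order
ordered_text = yaml.safe_dump(settings, sort_keys=False)

print(sorted_text)
print(ordered_text)
```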
Python write multiple Documents into a single YAML file
We will now write two Python objects into a single YAML file, output_2.yaml. We put both objects in a list and pass it to the dump_all method, which writes each item as a separate document.
import yaml

my_data_1 = {
    "Application Instance": {"app_name": "my_first_app", "version": "0.1.3"},
    "app_server_ip": "http://0.0.0.0:2003/",
    "open_ports": [9087, 765, 2003],
}
my_data_2 = {
    "Application Instance": {"app_name": "second_app", "version": "0.1.0"},
    "app_server_ip": "http://0.0.0.0:2001/",
    "open_ports": [80, 8080, 1003],
}

# Each item in the list becomes one YAML document
single_object = [my_data_1, my_data_2]
with open('output_2.yaml', 'w') as output_:
    yaml.dump_all(single_object, output_)
Output:
Application Instance:
  app_name: my_first_app
  version: 0.1.3
app_server_ip: http://0.0.0.0:2003/
open_ports:
- 9087
- 765
- 2003
---
Application Instance:
  app_name: second_app
  version: 0.1.0
app_server_ip: http://0.0.0.0:2001/
open_ports:
- 80
- 8080
- 1003
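In this output only the second document is preceded by ---. If every document should start with an explicit marker, the explicit_start option can be passed to the dump functions. The sketch below uses safe_dump_all with two small made-up documents and then reads the text back to confirm it round-trips.

```python
import yaml

documents = [{"app_name": "my_first_app"}, {"app_name": "second_app"}]

# explicit_start=True writes '---' before every document
text = yaml.safe_dump_all(documents, explicit_start=True)
print(text)

# Reading the text back should give the original list of documents
round_trip = list(yaml.safe_load_all(text))
print(round_trip == documents)
```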
Convert YAML to JSON
We will now load a YAML file, convert it to JSON, and store it on disk as a JSON file.
import yaml
import json

# Read multiple documents from a single YAML file
with open('sample.yaml', 'r') as read_:
    data = yaml.load_all(read_, Loader=yaml.SafeLoader)
    # load_all() returns a generator, so consume it while the file is open
    data = list(data)

# Write the loaded YAML data in JSON format
with open('sample.json', 'w') as sam_json:
    json.dump(data, sam_json, sort_keys=False)
Output:
[
    {
        "Application Instance":{
            "app_name":"my_first_app",
            "version":"0.1.3"
        },
        "app_server_ip":"http://0.0.0.0:2003/"
    },
    {
        "Application Instance":{
            "app_name":"my_second_app",
            "version":"0.1.1"
        },
        "app_server_ip":"http://0.0.0.0:2002/"
    }
]
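One caveat with this conversion is that YAML has types that JSON lacks. For example, PyYAML loads an unquoted date as a datetime.date object, which the json module cannot serialize on its own. The sketch below shows one possible workaround, passing default=str so such values are written as strings. The released key is a hypothetical field added for the demonstration.

```python
import json
import yaml

doc = yaml.safe_load("released: 2021-01-01\nversion: 0.1.3\n")
print(type(doc["released"]))  # <class 'datetime.date'>

# Plain json.dumps cannot serialize a date object
try:
    json.dumps(doc)
    failed = False
except TypeError:
    failed = True

# default=str converts unsupported values to their string form
as_json = json.dumps(doc, default=str)
print(as_json)  # {"released": "2021-01-01", "version": "0.1.3"}
```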