Chapter 2 — Data Models

Quick Recap

To recap: in the first chapter, we wrote our first test scenario for the chat service. Its purpose was to confirm that user registration works.

import vedro
import httpx

API_URL = "https://chat-api-tutorial.vedro.io/dkmaxmqj8b"

class Scenario(vedro.Scenario):
    subject = "register new user"

    # Arrange step: prepare the necessary data for the test
    def given_new_user(self):
        self.user = {"username": "bob", "password": "qweqwe"}

    # Act step: perform the primary action
    def when_guest_registers(self):
        self.response = httpx.post(f"{API_URL}/auth/register", json=self.user)

    # Assert step: verify that the system behaved as expected
    def then_it_should_return_success_response(self):
        assert self.response.status_code == 200

Now, it's time to enhance our test while exploring best practices for crafting automated tests.

Introducing Data Models

When we run the above test a second time, it fails because the username "bob" is already registered.

$ vedro run

Scenarios
* 
 ✗ register new user
   ✔ given_new_user
   ✔ when_guest_registers
   ✗ then_it_should_return_success_response
╭─────────────────────────── Traceback (most recent call last) ─────────────────────────╮
│ ./scenarios/first_scenario.py:17 in then_it_should_return_success_response            │
│                                                                                       │
│   14 │   │   self.response = httpx.post(f"{API_URL}/auth/register", json=self.user)   │
│   15 │                                                                                │
│   16 │   def then_it_should_return_success_response(self):                            │
│ ❱ 17 │   │   assert self.response.status_code == 200                                  │
│   18                                                                                  │
╰───────────────────────────────────────────────────────────────────────────────────────╯
AssertionError: assert 400 == 200
 +  where 400 = <Response [400 Bad Request]>.status_code
 
 
# 1 scenario, 0 passed, 1 failed, 0 skipped (0.28s)

The failure comes from the hardcoded data in our test. To keep each test run independent, the test data has to vary between runs, and data models make that possible. We will use the d42 library to define models and then generate, validate, and substitute data based on them.
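To see why hardcoded data breaks on a rerun, here is a minimal sketch that imitates the service in memory. The `InMemoryRegistry` class and its status codes are hypothetical stand-ins written for illustration. They are not part of the chat API, Vedro, or d42.

```python
class InMemoryRegistry:
    """Hypothetical stand-in for the registration endpoint (illustration only)."""

    def __init__(self):
        self._usernames = set()

    def register(self, user):
        """Return an HTTP-like status: 200 on success, 400 if the name is taken."""
        if user["username"] in self._usernames:
            return 400
        self._usernames.add(user["username"])
        return 200


registry = InMemoryRegistry()
hardcoded_user = {"username": "bob", "password": "qweqwe"}

first_run = registry.register(hardcoded_user)   # first test run succeeds
second_run = registry.register(hardcoded_user)  # rerun sends the same data
other_user = registry.register({"username": "alice", "password": "qweqwe"})

print(first_run, second_run, other_user)
```

The second call fails only because the state left by the first call is still there. A different username goes through, which is why varying the data between runs restores independence.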

Let's compare hardcoded data and a data model.

Hardcoded Data:

username = "bob"
password = "qweqwe"

Data Model:

from string import ascii_lowercase
from d42 import schema

NewUserSchema = schema.dict({
    "username": schema.str.alphabet(ascii_lowercase).len(3, 12),
    "password": schema.str.len(6, ...),
})

In this data model:

username is a string of 3 to 12 lowercase letters
password is a string with a minimum of 6 characters

(these specifications are based on the method documentation available at chat-api-tutorial.vedro.io/docs)

Data Generation

With the data model in place, we can generate fresh data for each test run:

from d42 import fake

fake(NewUserSchema)
# {'username': 'mwpd', 'password': 'EMiqcS2L9 x6UgxUuirjT9'}

fake(NewUserSchema)
# {'username': 'kqnhsrqito', 'password': 'XXlYxBaiXAvzj5Yp9pdR'}

fake(NewUserSchema)
# {'username': 'tzybe', 'password': 'Hr67Wxm6WLLLkhHFJm3SjA'}
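Generated data is unique in practice rather than guaranteed to be unique. The rough calculation below estimates how often two generated usernames could coincide. It assumes the length is picked uniformly from 3 to 12 and each letter uniformly from the alphabet. That is an assumption made for this estimate, not documented d42 behavior.

```python
from string import ascii_lowercase

MIN_LEN, MAX_LEN = 3, 12
ALPHABET_SIZE = len(ascii_lowercase)  # 26 lowercase letters

# Number of distinct usernames the model allows.
total_usernames = sum(ALPHABET_SIZE ** n for n in range(MIN_LEN, MAX_LEN + 1))

# Assumption: uniform length, then uniform letters. Two independent draws
# collide only if they pick the same length and the same letters.
length_choices = MAX_LEN - MIN_LEN + 1
pair_collision = sum(
    (1 / length_choices) ** 2 / ALPHABET_SIZE ** n
    for n in range(MIN_LEN, MAX_LEN + 1)
)

print(f"allowed usernames: {total_usernames:,}")
print(f"chance that two generated usernames match: {pair_collision:.2e}")
```

Under these assumptions, a collision between any two runs is rare, and almost all of the risk comes from the shortest usernames. Raising the minimum length in the model would reduce it further.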

Implementing this in our test scenario eliminates the problem of data dependency:

import vedro
import httpx
from d42 import fake
from schemas.user import NewUserSchema

API_URL = "https://chat-api-tutorial.vedro.io/dkmaxmqj8b"

class Scenario(vedro.Scenario):
    subject = "register new user"

    def given_new_user(self):
        self.user = fake(NewUserSchema)

    def when_guest_registers(self):
        self.response = httpx.post(f"{API_URL}/auth/register", json=self.user)

    def then_it_should_return_success_response(self):
        assert self.response.status_code == 200

Info: To keep our data models organized, we should save them in the schemas/ directory. In this case, we have created a file named user.py inside the schemas directory and placed the NewUserSchema definition there.

Data Validation

The beauty of data models is that they can not only generate data but also validate it. Validation checks that a received response fits the data model we defined:

OK:

response_body = {
    "username": "bob",
    "password": "qweqwe"
}
assert response_body == NewUserSchema

# No Errors

Incorrect Username:

response_body = {
    "username": "x",
    "password": "qweqwe"
}
assert response_body == NewUserSchema

# valera.ValidationException:
#  - Value <class 'str'> at _['username'] must have at least 3 elements, but it has 1 element

Incorrect Password:

response_body = {
    "username": "alice",
    "pass": "qweqwe"
}
assert response_body == NewUserSchema

# valera.ValidationException:
#  - Key _['password'] does not exist
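To make the three cases concrete, the checks that the model expresses can be written out by hand. This is a plain-Python sketch of the rules only. The function name `find_new_user_errors` and its messages are hypothetical, and this is not how d42 implements validation.

```python
from string import ascii_lowercase


def find_new_user_errors(body):
    """Collect rule violations for a user dict (illustrative, not d42 internals)."""
    errors = []

    if "username" not in body:
        errors.append("key 'username' does not exist")
    elif not isinstance(body["username"], str):
        errors.append("'username' must be a string")
    else:
        if not 3 <= len(body["username"]) <= 12:
            errors.append("'username' must have 3 to 12 characters")
        if any(ch not in ascii_lowercase for ch in body["username"]):
            errors.append("'username' must contain only lowercase letters")

    if "password" not in body:
        errors.append("key 'password' does not exist")
    elif not isinstance(body["password"], str):
        errors.append("'password' must be a string")
    elif len(body["password"]) < 6:
        errors.append("'password' must have at least 6 characters")

    return errors


ok = find_new_user_errors({"username": "bob", "password": "qweqwe"})
short_name = find_new_user_errors({"username": "x", "password": "qweqwe"})
missing_key = find_new_user_errors({"username": "alice", "pass": "qweqwe"})

print(ok, short_name, missing_key, sep="\n")
```

Written by hand, every new field costs several more branches, and a separate generator would have to be kept in step with them. A data model states the rules once and serves both generation and validation.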

This validation step ensures that the response has the correct structure and field types:

import vedro
import httpx
from d42 import fake
from schemas.user import NewUserSchema

API_URL = "https://chat-api-tutorial.vedro.io/dkmaxmqj8b"

class Scenario(vedro.Scenario):
    subject = "register new user"

    def given_new_user(self):
        self.user = fake(NewUserSchema)

    def when_guest_registers(self):
        self.response = httpx.post(f"{API_URL}/auth/register", json=self.user)

    def then_it_should_return_success_response(self):
        assert self.response.status_code == 200

    def and_then_it_should_return_created_user(self):
        assert self.response.json() == NewUserSchema

The test now checks not only that the username and password fields exist and are strings, but also that they meet the criteria defined in our data model.

For even more granular validation, we can refine the schema by substituting our generated values. This allows us to validate not just the type, but also the specific values of the fields:

NewUserSchema % {
    "username": "bob",
    "password": "qweqwe",
}

The % operator substitutes the values into the schema, much like printf-style string formatting in Python. The result of the substitution is a refined schema:

Substituted:

schema.dict({
    'username': schema.str('bob').alphabet(ascii_lowercase).len(3, 12),
    'password': schema.str('qweqwe').len(6, ...)
})

Original:

schema.dict({
    'username': schema.str.alphabet(ascii_lowercase).len(3, 12),
    'password': schema.str.len(6, ...),
})
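The practical difference between the two schemas can be sketched in plain Python: the original accepts any user that satisfies the rules, while the refined one also requires the exact values. The helper `matches_refined_model` is hypothetical and only mirrors the idea. It does not use d42.

```python
from string import ascii_lowercase


def matches_refined_model(body, expected):
    """True if body satisfies the general rules AND carries the expected values."""
    username, password = body.get("username"), body.get("password")
    general_rules_hold = (
        isinstance(username, str)
        and 3 <= len(username) <= 12
        and all(ch in ascii_lowercase for ch in username)
        and isinstance(password, str)
        and len(password) >= 6
    )
    expected_values_hold = all(
        body.get(key) == value for key, value in expected.items()
    )
    return general_rules_hold and expected_values_hold


sent = {"username": "bob", "password": "qweqwe"}

same_user = matches_refined_model({"username": "bob", "password": "qweqwe"}, sent)
wrong_user = matches_refined_model({"username": "alice", "password": "qweqwe"}, sent)

print(same_user, wrong_user)
```

The second response is a valid user by the general rules, so a check against the original schema would let it pass. Only the refined check catches that the service returned someone other than the user we registered.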

We can apply this refinement to our test scenario:

import vedro
import httpx
from d42 import fake
from schemas.user import NewUserSchema

API_URL = "https://chat-api-tutorial.vedro.io/dkmaxmqj8b"

class Scenario(vedro.Scenario):
    subject = "register new user"

    def given_new_user(self):
        self.user = fake(NewUserSchema)

    def when_guest_registers(self):
        self.response = httpx.post(f"{API_URL}/auth/register", json=self.user)

    def then_it_should_return_success_response(self):
        assert self.response.status_code == 200

    def and_then_it_should_return_created_user(self):
        assert self.response.json() == NewUserSchema % {
            "username": self.user["username"],
            "password": self.user["password"],
        }

Or simply:

    ...

    def and_then_it_should_return_created_user(self):
        assert self.response.json() == NewUserSchema % self.user

Wrap-up

In this chapter, we have successfully enhanced our test by incorporating data models. This not only makes our tests more robust by eliminating data dependency but also makes them easier to maintain and extend.

In the next chapters, we'll dive deeper into advanced test scenarios, explore test organization, and further examine best practices in test automation.
