Custom Validators¶
This guide shows you how to implement custom validation logic that goes beyond simple field constraints. You'll learn to write cross-field validators, conditional logic, and complex business rules that work seamlessly across both Pydantic and Polars.
Quick Reference¶
from datetime import datetime
from flycatcher import Schema, Field, col, model_validator

class BookingSchema(Schema):
    check_in: datetime
    check_out: datetime
    guests: int = Field(ge=1)

    @model_validator
    def check_dates():
        return (
            col('check_out') > col('check_in'),
            "Check-out must be after check-in"
        )
Key Functions:
- col(field_name) - Reference a field in validation expressions
- @model_validator - Decorator for cross-field validation functions
Understanding the col() DSL¶
The col() function creates field references that compile to both Polars expressions and Python callables. This lets you write validation logic once and have it work in both row-level (Pydantic) and bulk (Polars) contexts.
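To make the "write once, evaluate in two contexts" idea concrete, here is a minimal sketch of how a field reference could be evaluated against a single row. The `Expr` class and the `ref()` helper are hypothetical stand-ins written for this illustration. They are not Flycatcher's implementation, and only the row-level half is shown.

```python
import operator
from typing import Any, Callable, Mapping

Row = Mapping[str, Any]


class Expr:
    """Hypothetical expression node: wraps a function from a row to a value."""

    def __init__(self, fn: Callable[[Row], Any]) -> None:
        self._fn = fn

    def evaluate(self, row: Row) -> Any:
        return self._fn(row)

    def _binary(self, other: Any, op: Callable[[Any, Any], Any]) -> "Expr":
        # Lift plain literals so both operands are evaluated the same way.
        rhs = other if isinstance(other, Expr) else Expr(lambda _row: other)
        return Expr(lambda row: op(self.evaluate(row), rhs.evaluate(row)))

    def __gt__(self, other: Any) -> "Expr":
        return self._binary(other, operator.gt)

    def __ge__(self, other: Any) -> "Expr":
        return self._binary(other, operator.ge)

    def __and__(self, other: Any) -> "Expr":
        return self._binary(other, lambda a, b: a and b)


def ref(name: str) -> Expr:
    """Hypothetical analogue of col(): look a field up by name."""
    return Expr(lambda row: row[name])


# The rule is built once, then evaluated per row.
rule = (ref("check_out") > ref("check_in")) & (ref("guests") >= 1)
ok = rule.evaluate({"check_in": 1, "check_out": 5, "guests": 2})
bad = rule.evaluate({"check_in": 5, "check_out": 1, "guests": 2})
```

Because the rule is an object rather than an immediate boolean, the same tree could in principle be walked a second time to emit a Polars expression, which is the property the DSL relies on.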
Basic Field References¶
Comparison Operations¶
# Numeric comparisons
col('age') >= 18
col('price') > 0
col('discount_price') < col('regular_price')
# Equality checks
col('status') == 'active'
col('email') != None
# Date/time comparisons
col('end_date') > col('start_date')
String Operations¶
# String methods
col('email').str.contains('@')
col('email').str.ends_with('.com')
col('name').str.starts_with('Dr.')
col('tag').str.len_chars() <= 20
# Pattern matching
col('phone').str.contains(r'^\d{3}-\d{3}-\d{4}$')
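The anchors in the phone pattern matter. The sketch below uses Python's `re.search` as a stand-in, assuming that `str.contains` follows search semantics (a match anywhere in the value), as it does in Polars. The `contains` helper is hypothetical and exists only for this illustration.

```python
import re

PHONE = r"^\d{3}-\d{3}-\d{4}$"


def contains(value: str, pattern: str) -> bool:
    """Search semantics: true if the pattern matches anywhere in the value."""
    return re.search(pattern, value) is not None


# Anchored pattern: the whole value must be a phone number.
anchored_exact = contains("555-123-4567", PHONE)
anchored_extra = contains("call 555-123-4567 now", PHONE)

# Unanchored pattern: surrounding text is accepted.
unanchored_extra = contains("call 555-123-4567 now", r"\d{3}-\d{3}-\d{4}")
```

Polars evaluates patterns with the Rust regex engine, which does not support look-around or backreferences, so patterns that rely on those features are best avoided if the rule needs to run in both contexts.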
Logical Operations¶
# AND
(col('age') >= 18) & (col('age') <= 65)
# OR
(col('type') == 'premium') | (col('type') == 'enterprise')
# NOT
~(col('status') == 'deleted')
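The parentheses around each comparison are required, not stylistic. In Python, `&` and `|` bind more tightly than comparison operators, so an unparenthesized rule is parsed differently from how it reads. The plain-integer sketch below shows the effect; with DSL expressions the same parse would likely produce an error or an unintended expression.

```python
age = 70

# Intended rule: 18 <= age <= 65, with each comparison parenthesized.
with_parens = (age >= 18) & (age <= 65)

# Without parentheses, & is applied first, so Python parses this as
# age >= (18 & age) <= 65, a chained comparison on a bitwise result.
without_parens = age >= 18 & age <= 65

bitwise_part = 18 & age
```

Here the parenthesized rule correctly rejects 70, while the unparenthesized form accepts it.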
Null Checks¶
Membership Operations¶
# Membership with optional null matching
col('country').is_in(['US', 'CA'])
col('country').is_in([None, 'CA'], nulls_equal=True)
# Between with configurable interval closure
col('age').is_between(18, 65) # inclusive on both sides
col('score').is_between('min_score', 'max_score', closed='right')
- is_in accepts a sequence or Series; set nulls_equal=True to treat None as a distinct value instead of propagating nulls.
- is_between accepts expressions for bounds (strings are parsed as column names) and supports closed='both' | 'left' | 'right' | 'none'.
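The four closed options differ only in whether each endpoint counts as inside the interval. A plain-Python reading of them, written as a standalone function for illustration rather than taken from the library:

```python
def is_between(value: float, lower: float, upper: float, closed: str = "both") -> bool:
    """Plain-Python reading of the four interval closures."""
    if closed == "both":
        return lower <= value <= upper
    if closed == "left":
        return lower <= value < upper
    if closed == "right":
        return lower < value <= upper
    if closed == "none":
        return lower < value < upper
    raise ValueError(f"unknown closed value: {closed!r}")


# For each mode: (is the lower endpoint included?, is the upper endpoint included?)
endpoints = {
    mode: (is_between(18, 18, 65, mode), is_between(65, 18, 65, mode))
    for mode in ("both", "left", "right", "none")
}
```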
Numeric & Math Operations¶
For numeric fields, you can use Polars-style math helpers that work in both Polars and Pydantic contexts:
# Absolute value (already available)
col('delta').abs() > 0
# Rounding
col('price').round(2) # 2 decimal places
col('score').round() # to integer
# Floor / ceil
col('score').floor() >= 0
col('ratio').ceil() <= 10
# Square root
col('variance').sqrt() <= 5
# Power
col('amount').pow(2) <= col('limit')
- The DSL mirrors the Polars API where possible (round(decimals=0), .floor(), .ceil(), .sqrt(), .pow(exponent)).
- On the Python side, these use the standard library (round, math), so behavior is consistent with built-in Python and no extra dependencies are required.
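Since the Python side relies on the built-in round and the math module, their edge cases are worth knowing before writing thresholds around rounded values. The snippet below shows standard-library behavior only; it makes no claim about how the Polars path treats the same inputs.

```python
import math

# Built-in round() sends exact ties to the nearest even integer.
half_cases = [round(0.5), round(1.5), round(2.5)]

# 2.675 cannot be represented exactly in binary floating point and is
# stored slightly below 2.675, so it rounds down at 2 decimal places.
binary_float = round(2.675, 2)

# floor and ceil move toward negative and positive infinity respectively.
floor_negative = math.floor(-1.5)
ceil_negative = math.ceil(-1.5)

# sqrt always returns a float.
root = math.sqrt(16)
```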
Date/Time Operations¶
# Extract components
col('created_at').dt.year() == 2024
col('created_at').dt.month() == 6 # June
col('created_at').dt.day() >= 15
col('created_at').dt.hour() < 18 # Before 6 PM
col('created_at').dt.minute() == 0 # On the hour
col('created_at').dt.second() == 0 # On the minute
# Date difference (returns float days)
col('check_out').dt.total_days(col('check_in')) >= 2 # Minimum 2 nights
col('created_at').dt.total_days(datetime(2024, 1, 1)) > 0 # After Jan 1, 2024
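Because the difference is a float number of days, time-of-day affects the result. The sketch below is a plain-Python analogue, assuming total_days measures elapsed time rather than calendar dates, as "returns float days" suggests. The `total_days` function here is written for illustration and is not the library's code.

```python
from datetime import datetime


def total_days(later: datetime, earlier: datetime) -> float:
    """Elapsed time between two datetimes, expressed in fractional days."""
    return (later - earlier).total_seconds() / 86_400


# Check in at 15:00, check out two calendar days later at 11:00.
stay = total_days(datetime(2024, 6, 3, 11, 0), datetime(2024, 6, 1, 15, 0))
passes_two_night_rule = stay >= 2
```

Under that assumption, a two-night stay with an afternoon check-in and a morning check-out spans 44 hours and would fail a `>= 2` rule. Normalizing both values to dates before validating is one way to avoid this.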
Cross-Field Validation with @model_validator¶
Use @model_validator to implement validation rules that depend on multiple fields.
Basic Cross-Field Validation¶
Example: Price Comparison
from flycatcher import Schema, Field, col, model_validator

class ProductSchema(Schema):
    regular_price: float = Field(gt=0)
    sale_price: float | None = Field(default=None, gt=0, nullable=True)

    @model_validator
    def check_sale_price():
        """Sale price must be less than regular price."""
        # sale_price is nullable; see Handling Nullable Fields for how to
        # skip this check when no sale price is provided.
        return (
            col('sale_price') < col('regular_price'),
            "Sale price must be less than regular price"
        )
How it works:
- The function returns a tuple: (condition_expression, error_message)
- The condition should evaluate to True for valid data
- Flycatcher compiles this to both Pydantic validators and Polars expressions
Date Range Validation¶
Example: Event Booking
from datetime import datetime
from flycatcher import Schema, Field, col, model_validator

class EventSchema(Schema):
    start_time: datetime
    end_time: datetime
    duration_hours: int = Field(ge=1, le=12)

    @model_validator
    def check_end_after_start():
        """End time must be after start time."""
        return (
            col('end_time') > col('start_time'),
            "Event must end after it starts"
        )
Conditional Validation¶
Example: Discount Rules
from flycatcher import Schema, Field, col, model_validator

class OrderSchema(Schema):
    item_type: str
    quantity: int = Field(ge=1)
    unit_price: float = Field(gt=0)
    discount_percent: float = Field(ge=0, le=100, default=0)

    @model_validator
    def bulk_discount_rule():
        """Orders of 10+ items get at least 10% discount."""
        return (
            (col('quantity') < 10) | (col('discount_percent') >= 10),
            "Bulk orders (10+) must have at least 10% discount"
        )
Understanding the logic:
- col('quantity') < 10 - Small orders can have any discount
- | (OR) - Either condition can be true
- col('discount_percent') >= 10 - Large orders must have discount
- Result: Small orders pass automatically, large orders need the discount
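The OR form is the standard way to express "if A then B" without an if statement: "A implies B" is equivalent to "not A, or B". The plain-Python check below compares the two formulations over a small grid of inputs. Both function names are made up for this illustration.

```python
from itertools import product


def rule_as_or(quantity: int, discount: float) -> bool:
    """The form used in the validator: small order OR enough discount."""
    return (quantity < 10) or (discount >= 10)


def rule_as_if(quantity: int, discount: float) -> bool:
    """The same rule spelled out as a conditional."""
    if quantity >= 10:
        return discount >= 10
    return True


cases = list(product([1, 9, 10, 50], [0.0, 9.9, 10.0, 25.0]))
agree = all(rule_as_or(q, d) == rule_as_if(q, d) for q, d in cases)
```

The OR form is preferred in the DSL because it is a single expression, which is what allows it to be compiled to a Polars expression.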
Multiple Validators¶
You can add multiple validators to a single schema:
from datetime import datetime
from flycatcher import Schema, Field, col, model_validator

class BookingSchema(Schema):
    check_in: datetime
    check_out: datetime
    guests: int = Field(ge=1)
    room_type: str

    @model_validator
    def check_dates():
        """Check-out must be after check-in."""
        return (
            col('check_out') > col('check_in'),
            "Check-out date must be after check-in date"
        )

    @model_validator
    def check_guest_capacity():
        """Rooms have max 4 guests, suites have max 8."""
        return (
            (
                ((col('room_type') == 'room') & (col('guests') <= 4))
                | ((col('room_type') == 'suite') & (col('guests') <= 8))
            ),
            "Room capacity exceeded (rooms: 4, suites: 8)"
        )

    @model_validator
    def check_minimum_stay():
        """Require a 2+ night minimum stay (applied to every booking)."""
        # A weekend-only version (check-in on Fri/Sat) needs weekday(),
        # which is not yet available in the DSL; use explicit Polars for that.
        return (
            col('check_out').dt.total_days(col('check_in')) >= 2,
            "Minimum stay is 2 nights"
        )
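The check_minimum_stay validator above applies the 2-night minimum to every booking, because the DSL has no weekday() yet. For reference, the weekend-only rule it describes can be sketched in plain Python outside the DSL. The function and constant names below are hypothetical, and this runs per row only; it is not compiled to Polars.

```python
from datetime import datetime

# datetime.weekday() numbers Monday as 0, so Friday is 4 and Saturday is 5.
WEEKEND_CHECK_IN = {4, 5}


def meets_minimum_stay(check_in: datetime, check_out: datetime) -> bool:
    """Weekend check-ins need at least 2 full days; other days always pass."""
    if check_in.weekday() not in WEEKEND_CHECK_IN:
        return True
    return (check_out - check_in).total_seconds() / 86_400 >= 2


# 7 June 2024 is a Friday; 3 June 2024 is a Monday.
friday_one_night = meets_minimum_stay(datetime(2024, 6, 7), datetime(2024, 6, 8))
friday_two_nights = meets_minimum_stay(datetime(2024, 6, 7), datetime(2024, 6, 9))
monday_one_night = meets_minimum_stay(datetime(2024, 6, 3), datetime(2024, 6, 4))
```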
Handling Nullable Fields¶
When validating optional fields, combine is_null() with the | (OR) operator so the rule passes whenever the field is None:
from flycatcher import Schema, Field, col, model_validator

class ProductSchema(Schema):
    regular_price: float = Field(gt=0)
    sale_price: float | None = Field(default=None, gt=0, nullable=True)  # Optional field

    @model_validator
    def check_sale_price():
        """If sale price is provided, it must be less than regular price."""
        return (
            col('sale_price').is_null()  # Skip if not provided
            | (col('sale_price') < col('regular_price')),  # Validate if provided
            "Sale price must be less than regular price when provided"
        )
Pattern: col('field').is_null() | (validation_condition)
This ensures validation only applies when the optional field has a value.
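The guard matters because a comparison against a missing value is neither true nor false. The sketch below models that with three-valued logic, where None stands for "unknown", assuming null comparisons behave as they do in Polars. The helper functions are hypothetical and written only to show why the is_null() branch rescues the result.

```python
from typing import Optional


def kleene_or(a: Optional[bool], b: Optional[bool]) -> Optional[bool]:
    """Three-valued OR: a definite True wins, otherwise unknown propagates."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def less_than(a: Optional[float], b: Optional[float]) -> Optional[bool]:
    """Comparison that yields None (unknown) when either side is missing."""
    if a is None or b is None:
        return None
    return a < b


def sale_price_ok(sale: Optional[float], regular: float) -> Optional[bool]:
    """The documented pattern: is_null() | (validation_condition)."""
    return kleene_or(sale is None, less_than(sale, regular))


unguarded = less_than(None, 20.0)    # unknown: neither pass nor fail
missing = sale_price_ok(None, 20.0)  # guard turns unknown into a pass
cheaper = sale_price_ok(15.0, 20.0)
dearer = sale_price_ok(25.0, 20.0)
```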
Testing Your Validators¶
Test with Pydantic (Row-Level)¶
from datetime import datetime
import pytest
from pydantic import ValidationError

# Generate a Pydantic model from the BookingSchema defined under Multiple Validators
Booking = BookingSchema.to_pydantic()

# Valid booking
valid_booking = Booking(
    check_in=datetime(2024, 6, 1),
    check_out=datetime(2024, 6, 5),
    guests=2,
    room_type='room',  # required by the schema
)
assert valid_booking.guests == 2

# Invalid booking (check-out before check-in)
with pytest.raises(ValidationError) as exc_info:
    Booking(
        check_in=datetime(2024, 6, 5),
        check_out=datetime(2024, 6, 1),
        guests=2,
        room_type='room',
    )
assert "Check-out date must be after check-in date" in str(exc_info.value)
Test with Polars (Bulk)¶
import polars as pl
import pytest
from datetime import datetime

# Generate a Polars validator from the BookingSchema defined under Multiple Validators
BookingValidator = BookingSchema.to_polars_validator()

# Valid data
valid_df = pl.DataFrame({
    "check_in": [datetime(2024, 6, 1), datetime(2024, 7, 1)],
    "check_out": [datetime(2024, 6, 5), datetime(2024, 7, 5)],
    "guests": [2, 4],
    "room_type": ["room", "room"],  # required by the schema
})
validated = BookingValidator.validate(valid_df, strict=True)
assert len(validated) == 2

# Invalid data
invalid_df = pl.DataFrame({
    "check_in": [datetime(2024, 6, 5)],
    "check_out": [datetime(2024, 6, 1)],  # Before check-in!
    "guests": [2],
    "room_type": ["room"],
})
with pytest.raises(Exception) as exc_info:
    BookingValidator.validate(invalid_df, strict=True)
assert "Check-out date must be after check-in date" in str(exc_info.value)
Troubleshooting¶
Validator Not Running¶
Problem: Your validator doesn't seem to execute.
Solution: Ensure it's decorated with @model_validator and returns a tuple:
# ❌ Missing decorator
def my_check():
    return (col('x') > 0, "Error")

# ✅ Correct
@model_validator
def my_check():
    return (col('x') > 0, "Error")
Type Errors in Polars¶
Problem: TypeError when validating with Polars.
Solution: Ensure your DSL expressions use Polars-compatible operations:
# ❌ Missing .str namespace (not a valid Polars expression method)
col('email').contains('@')

# ✅ Polars string method
col('email').str.contains('@')
Validation Too Strict¶
Problem: Valid data is being rejected.
Solution: Pass show_violations=True when validating to see which rows fail, then check the condition against those rows:
@model_validator
def check_prices():
    return (
        col('sale_price') < col('regular_price'),
        "Sale price must be less than regular price"
    )

# Test with show_violations to see failing rows
validator.validate(df, strict=True, show_violations=True)
Next Steps¶
- Tutorials - First Schema - Learn the basics
Need Help?¶
- Ask in GitHub Discussions
- Report a Bug
- Read the Full Documentation