How to Build a Property Valuation Tool with a Real Estate API
Automated property valuation is one of the most common use cases for real estate APIs. Banks use it for mortgage risk assessment. Portals use it to show "estimated value" badges on listings. Investors use it to find underpriced deals.
This tutorial walks through building a simple valuation tool that estimates a property's market value using comparable listings and area pricing data. We'll use Python and the Stream.estate API. The same approach works with any real estate API that exposes property search and market data endpoints.
What You Need
- Python 3.8+
- The requests library (pip install requests)
- A Stream.estate API key (or adapt the concepts to any real estate API that provides comparable and market data)
The Valuation Approach
We're using the comparable sales method — the same foundation that real estate appraisers use in person. The logic is straightforward:
- Get the target property's details (type, size, location, rooms)
- Find similar properties in the same area with similar size, type, and room count
- Get the area's average price per square meter as a market anchor
- Calculate an estimate using the comparable prices
This is a simplified version of what AVMs (Automated Valuation Models) do. Professional AVMs use hundreds of features and ML models trained on years of transaction data. Our version uses the fundamentals: comparable properties and area averages. It won't match a professional appraisal, but it gives you a reasonable ballpark — and a foundation you can build on.
Step 1: Set Up the API Client
A small wrapper to keep the rest of the code clean:
import requests
from statistics import median, mean

API_KEY = "your-api-key"
BASE = "https://api.stream.estate"
HEADERS = {"X-API-KEY": API_KEY}

def api_get(endpoint, params=None):
    resp = requests.get(f"{BASE}{endpoint}", headers=HEADERS, params=params)
    resp.raise_for_status()
    return resp.json()
Nothing fancy. Every request hits the same base URL with the same auth header. raise_for_status() raises an exception on any 4xx/5xx response, which is fine for a tutorial. In production you'd want retry logic for rate limits.
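One way to handle rate limits, sketched generically: retry with exponential backoff. The with_retries helper and RateLimitError class below are illustrative names, not part of the requests library or the Stream.estate API; in practice you'd catch an HTTPError with status 429 and honor any Retry-After header.

```python
import time

class RateLimitError(Exception):
    """Stand-in for an HTTP 429 response (illustrative, not part of requests)."""

def with_retries(fn, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on RateLimitError with exponential backoff.

    sleep is injectable so the backoff schedule can be tested without waiting.
    """
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...
```

You would wrap a call like api_get("/documents/properties") in a small function, translate 429 responses into RateLimitError, and pass it to with_retries.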
Step 2: Get Target Property Details
Given a property UUID, pull its key characteristics. These are the attributes we'll use to find comparables.
def get_property(uuid):
    data = api_get(f"/documents/properties/{uuid}")
    return {
        "uuid": data["uuid"],
        "type": data["propertyType"],
        "transaction": data["transactionType"],
        "surface": data.get("surface"),
        "rooms": data.get("rooms"),
        "city": data.get("city"),
        "postal_code": data.get("postalCode"),
        "latitude": data.get("latitude"),
        "longitude": data.get("longitude"),
    }
We use .get() for fields that might not always be present. Some listings lack room counts or exact coordinates. Our valuation code will need to handle those gaps.
Step 3: Find Comparable Properties
This is the core of the valuation. You need properties that are similar enough to be useful comparisons but not so narrowly filtered that you end up with two results. There are two approaches.
A) Use the similar properties endpoint
The easiest option. The API does the matching for you:
def get_similar(uuid):
    data = api_get(f"/documents/properties/{uuid}/similar")
    return data.get("hydra:member", [])
This returns properties that the API considers similar based on location, type, and size. You don't control the matching criteria, but it's a fast way to get started.
B) Build your own comparable search
If you want more control over what counts as "comparable," query the search endpoint with explicit filters:
def find_comparables(property_info, surface_tolerance=0.3):
    surface = property_info["surface"]
    if not surface:
        return []  # can't build a surface window without a known surface
    params = {
        "city": property_info["city"],
        "transactionType": "sale",
        "propertyType": property_info["type"],
        "surfaceMin": int(surface * (1 - surface_tolerance)),
        "surfaceMax": int(surface * (1 + surface_tolerance)),
        "itemsPerPage": 30,
    }
    if property_info.get("rooms"):
        params["roomsMin"] = max(1, property_info["rooms"] - 1)
        params["roomsMax"] = property_info["rooms"] + 1
    data = api_get("/documents/properties", params)
    return data.get("hydra:member", [])
Here we search for properties in the same city with the same type (apartment, house, etc.) and a surface area within 30% of the target. If we know the room count, we allow plus or minus one room. You can tighten or loosen these ranges depending on how many results you get — too few comparables and your estimate is noisy, too many and you're comparing unlike properties.
The surface_tolerance parameter is worth tuning. In dense urban markets with lots of listings, 0.2 (20%) works well. In rural areas where inventory is thin, you might need 0.4 or more to get enough comparables.
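That tuning can be automated: start with a tight tolerance and widen it until you have enough comparables. A sketch, where search_fn stands in for the find_comparables function above and the widening schedule (0.2 to 0.5 in steps of 0.1) is an assumption to adjust, not an API feature:

```python
def find_with_adaptive_tolerance(property_info, search_fn,
                                 start=0.2, step=0.1, max_tol=0.5,
                                 min_results=8):
    """Widen the surface tolerance until at least min_results comparables
    are found or max_tol is reached. Returns (results, tolerance_used)."""
    tol = start
    results = []
    while tol <= max_tol + 1e-9:
        results = search_fn(property_info, tol)
        if len(results) >= min_results:
            break
        tol += step
    return results, min(tol, max_tol)
```

This keeps the comparable set tight in liquid markets and loosens it automatically where inventory is thin.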
Step 4: Get Area Market Data
City-level price-per-meter data serves as an anchor for our estimate. If comparables are scarce, this gives us a fallback. If comparables are plentiful, it provides a sanity check.
def get_price_per_meter(city, transaction_type="sale"):
    data = api_get("/indicators/price-per-meter", {
        "city": city,
        "transactionType": transaction_type,
    })
    return data
This endpoint returns the average asking price per square meter for a given city and transaction type. Keep in mind this is based on listing prices, not actual transaction prices. Listing prices tend to run higher than what buyers actually pay.
Step 5: Calculate the Estimate
Now we combine everything. The estimate averages two data points: the median price per m² from comparable properties, and the area-wide average price per m².
def estimate_value(target_uuid):
    # Get target property
    target = get_property(target_uuid)
    if not target["surface"]:
        return {"error": "No surface area available for this property"}

    # Get comparable prices per m²
    comparables = get_similar(target_uuid)
    comp_prices_per_m2 = []
    for comp in comparables:
        adverts = comp.get("adverts", [])
        if adverts and comp.get("surface"):
            price = adverts[0].get("price")
            if price and comp["surface"] > 0:
                comp_prices_per_m2.append(price / comp["surface"])

    # Get area average
    area_data = get_price_per_meter(target["city"])

    # Build estimate from available data points
    estimates = []
    if comp_prices_per_m2:
        comp_median = median(comp_prices_per_m2)
        estimates.append(comp_median * target["surface"])
    if area_data:
        area_price = area_data.get("pricePerMeter")
        if area_price:
            estimates.append(area_price * target["surface"])

    if not estimates:
        return {"error": "Not enough data to estimate"}

    return {
        "property": target,
        "estimated_value": round(mean(estimates)),
        "comparables_used": len(comp_prices_per_m2),
        "comparable_median_per_m2": round(median(comp_prices_per_m2)) if comp_prices_per_m2 else None,
        "area_avg_per_m2": area_data.get("pricePerMeter") if area_data else None,
    }
A few things to note:
- We use median for comparable prices, not mean. A single outlier (a luxury renovation listed at 2x the market) won't skew the result.
- We average the comparable-based estimate with the area average. This keeps the result anchored when comparables are few or skewed. If you have many good comparables, you might want to weight them more heavily — say 70/30 instead of 50/50.
- The function returns both the final estimate and the underlying data points. This matters. A valuation number without context is useless. Knowing you had 15 comparables at €4,200/m² vs. an area average of €3,800/m² tells you a lot more than just "€273,000."
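The 70/30 idea mentioned above can also be made to scale with how many comparables you actually found. A sketch, where the weights and the full_confidence_at cutoff are assumptions to tune, not recommendations:

```python
def blended_estimate(comp_value, area_value, n_comparables,
                     max_comp_weight=0.7, full_confidence_at=10):
    """Blend the comparable-based and area-based estimates, trusting the
    comparables more as their count grows (up to max_comp_weight)."""
    w = max_comp_weight * min(n_comparables / full_confidence_at, 1.0)
    return w * comp_value + (1 - w) * area_value
```

With zero comparables the function falls back entirely to the area average; with ten or more it settles at the 70/30 split.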
Step 6: Use It
result = estimate_value("some-property-uuid")

if "error" in result:
    print(f"Could not estimate: {result['error']}")
else:
    print(f"Estimated value: €{result['estimated_value']:,}")
    print(f"Based on {result['comparables_used']} comparable properties")
    print(f"Comparable median: €{result['comparable_median_per_m2']:,}/m²")
    print(f"Area average: €{result['area_avg_per_m2']:,}/m²")
For a 65 m² apartment in Lyon with 12 comparables, the output might look like:
Estimated value: €247,000
Based on 12 comparable properties
Comparable median: €3,920/m²
Area average: €3,680/m²
That spread between comparable median and area average is useful information. If comparables are significantly higher than the area average, the property might be in a premium neighborhood, or the comparables might be biased toward recently renovated units.
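To surface that signal programmatically, compute the relative spread and flag it when it exceeds a threshold. The 10% cutoff below is an arbitrary starting point, not a calibrated value:

```python
def comparable_spread(comp_median_per_m2, area_avg_per_m2, threshold=0.10):
    """Relative spread of the comparable median over the area average,
    plus a flag when the gap is large enough to warrant a closer look."""
    spread = (comp_median_per_m2 - area_avg_per_m2) / area_avg_per_m2
    return spread, abs(spread) > threshold
```

For the Lyon example above, (3920 - 3680) / 3680 is roughly 6.5%, under the 10% threshold.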
Limitations and Where This Falls Short
This tool gives you a starting point, not a finished appraisal. Here's what it misses:
- Floor and orientation. A ground-floor apartment facing a courtyard and a top-floor unit with a south-facing balcony in the same building can differ by 20% or more. Our model treats them the same.
- Property condition. Renovated vs. needs-a-full-gut is a massive factor. Listing data rarely captures this reliably. Some listings mention "refait à neuf" (fully renovated) in the description; most don't.
- Hyperlocal pricing. Price per m² can vary block by block, especially in cities with distinct micro-neighborhoods. City-level averages smooth over these differences entirely.
- Market timing. In a fast-moving market, listings from three months ago may reflect a different price reality. Our tool treats all active listings equally regardless of when they appeared.
- Asking vs. selling price. Everything here is based on listing prices. Actual transaction prices — what buyers paid after negotiation — are often 5-10% lower. In France, the DVF dataset provides actual transaction data, but with a lag of several months.
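The asking-vs-selling gap is the easiest of these to correct for crudely: apply a flat haircut to the listing-based estimate. The 7% default below is a guess inside the 5-10% range quoted above, not a calibrated figure; a serious version would calibrate it against DVF transaction data per market.

```python
def asking_to_transaction(asking_estimate, discount=0.07):
    """Adjust a listing-price-based estimate toward a likely transaction
    price by applying a flat negotiation discount."""
    return round(asking_estimate * (1 - discount))
```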
Improving the Tool
If you want to take this further, here are concrete next steps:
- Weight comparables by proximity. Calculate the distance between the target and each comparable. Properties on the same street should count more than ones across town. A simple inverse-distance weighting goes a long way.
- Filter by recency. Only use listings from the last 3-6 months. Drop anything older. Markets move.
- Add more features. If the API provides floor number, parking, balcony, or energy rating data, factor those in. Each additional feature narrows the comparable set and improves accuracy.
- Combine with DVF data. In France, the DVF (Demandes de Valeurs Foncières) dataset publishes actual transaction prices. Cross-referencing API listing data with DVF sale prices gives you ground truth to calibrate against. Our French Property Data API guide covers the DVF dataset and other French-specific data sources in detail.
- Train an ML model. Once you have enough data, a gradient boosted model (XGBoost, LightGBM) trained on property features and sale prices will outperform the manual comparable approach. But you need volume — hundreds of transactions in each market segment — for the model to generalize.
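The first of those steps, inverse-distance weighting, can be sketched with a haversine distance and weights of 1/(d + eps). The dictionary field names and the eps floor (which prevents division by zero for a comparable at the exact same coordinates) are assumptions for this sketch:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def idw_price_per_m2(target, comps, eps=0.1):
    """Inverse-distance-weighted mean of comparables' price per m².
    Nearby comparables dominate; eps avoids division by zero at d = 0."""
    num = den = 0.0
    for c in comps:
        d = haversine_km(target["latitude"], target["longitude"],
                         c["latitude"], c["longitude"])
        w = 1.0 / (d + eps)
        num += w * c["price_per_m2"]
        den += w
    return num / den if den else None
```

A comparable a hundred metres away will outweigh one across the city by orders of magnitude, which is usually what you want; raise eps to soften that effect.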
Wrapping Up
Property valuation from API data is doable and useful, but it's not magic. The quality of your estimate depends entirely on the quality and quantity of your comparables. In dense urban markets with lots of active listings, you'll get reasonable estimates. In markets with sparse data, the numbers will be rough.
The code in this tutorial is a foundation. The real work starts when you tune it: adjusting tolerance ranges, weighting by distance and recency, and validating against actual transaction data.
This tutorial uses the Stream.estate API. For the full setup guide, see our Developer Integration Guide. For background on how real estate APIs work, start with What Is a Real Estate Data API?. To compare API providers and understand what each one costs, check our Real Estate API Pricing Comparison.