This project demonstrates how to use the Geoapify Geocoding API to geocode addresses from an input file and generate a validation report based on confidence levels.
It is based on the original Geocode Example but adds logic to classify addresses as CONFIRMED, PARTIALLY_CONFIRMED, or NOT_CONFIRMED.
Make sure the following are installed:
- Python 3.12 or higher (the script uses itertools.batched(), which was added in Python 3.12)
- pip (Python package manager)
git clone https://geoapify.github.io/maps-api-code-samples/
cd maps-api-code-samples/python/
python -m venv env
source env/bin/activate # On Windows: env\Scripts\activate

python address_verification.py \
--api_key YOUR_API_KEY \
--input input.txt \
--output geocoded.ndjson \
--validation_output address_validation.csv \
--min_confirmed 0.85 \
--max_not_confirmed 0.4

| Argument | Required | Description |
|---|---|---|
| `--api_key` | Yes | Your Geoapify API key. |
| `--input` | Yes | Input file with one address per line. |
| `--output` | Yes | Output file for geocoding results (NDJSON format). |
| `--validation_output` | Yes | Output CSV file with validation results. |
| `--country_code` | No | Restrict geocoding to a specific country (e.g., us, de, fr). |
| `--min_confirmed` | No | Minimum confidence to be considered CONFIRMED (default: 0.9). |
| `--max_not_confirmed` | No | Maximum confidence to be considered NOT_CONFIRMED (default: 0.5). |

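
The arguments listed above could be declared with `argparse` roughly as follows. This is only a sketch and not necessarily how `address_verification.py` defines its parser: the function name `build_parser` and the help strings are assumptions, while the flag names, required flags, and defaults are taken from the table.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented command-line arguments."""
    parser = argparse.ArgumentParser(
        description="Geocode addresses and write a validation report.")
    parser.add_argument("--api_key", required=True, help="Geoapify API key")
    parser.add_argument("--input", required=True,
                        help="Input file with one address per line")
    parser.add_argument("--output", required=True,
                        help="NDJSON file for geocoding results")
    parser.add_argument("--validation_output", required=True,
                        help="CSV file for validation results")
    parser.add_argument("--country_code", default=None,
                        help="Optional country restriction, e.g. us, de, fr")
    parser.add_argument("--min_confirmed", type=float, default=0.9,
                        help="Minimum confidence for CONFIRMED")
    parser.add_argument("--max_not_confirmed", type=float, default=0.5,
                        help="Maximum confidence for NOT_CONFIRMED")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args([
    "--api_key", "YOUR_API_KEY",
    "--input", "input.txt",
    "--output", "geocoded.ndjson",
    "--validation_output", "address_validation.csv",
    "--min_confirmed", "0.85",
])
print(args.min_confirmed, args.max_not_confirmed, args.country_code)
```

Omitted optional flags fall back to the documented defaults, so only `--min_confirmed` differs from them in this run.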
Each geocoded result is evaluated using confidence scores provided by the API:
- `rank.confidence`
- `rank.confidence_city_level`
- `rank.confidence_street_level`
- `rank.confidence_building_level`

The overall confidence determines the classification:

- `CONFIRMED` – if confidence ≥ `--min_confirmed`
- `NOT_CONFIRMED` – if confidence ≤ `--max_not_confirmed`
- `PARTIALLY_CONFIRMED` – if confidence is in between

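For PARTIALLY_CONFIRMED cases, a reason is provided, where `{LEVEL}` is `CITY`, `STREET`, or `BUILDING`:

- `{LEVEL}_NOT_CONFIRMED` – the level's confidence is missing or 0
- `LOW_{LEVEL}_LEVEL_CONFIDENCE` – the level's confidence is ≤ `--max_not_confirmed`
- `{LEVEL}_LEVEL_DOUBTS` – the level's confidence lies between the two thresholds

If geocoding fails, the result is NOT_CONFIRMED with reason: "No geocoding result".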
- Contains raw API responses, one per line.
- Useful for debugging or reprocessing later.
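
Because the NDJSON file holds one JSON document per line, it can be reprocessed line by line. The sketch below assumes each non-empty line is a complete JSON value. The helper name `read_ndjson` is hypothetical, and the two sample records are shortened illustrations, not real API responses.

```python
import io
import json


def read_ndjson(stream):
    """Yield one parsed JSON value per non-empty line of the stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Illustrative records only; real lines hold full API responses.
# An in-memory stream stands in for open("geocoded.ndjson").
sample = io.StringIO(
    '{"rank": {"confidence": 1}}\n'
    '\n'
    '{"error": "Not found"}\n'
)
records = list(read_ndjson(sample))
print(len(records))
```

Reading lazily with a generator keeps memory use flat even for large result files.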
Original Address,Validation Result,Reason
"1600 Amphitheatre Parkway, Mountain View, CA 94043, USA",CONFIRMED,
"Unknown Street, Nowhere",NOT_CONFIRMED,No geocoding result
"Main St, Smalltown",PARTIALLY_CONFIRMED,CITY_LEVEL_DOUBTS
"456 Example St, Springfield",PARTIALLY_CONFIRMED,LOW_STREET_LEVEL_CONFIDENCE
def geocode_addresses(api_key, addresses, output_file, country_code):
    # Split addresses into batches
    addresses = list(it.batched(addresses, REQUESTS_PER_SECOND))

    # Request results asynchronously for each address batch
    tasks = []
    with ThreadPoolExecutor(max_workers=10) as executor:
        for batch in addresses:
            logger.info(batch)
            tasks.extend([executor.submit(geocode_address, address, api_key, country_code) for address in batch])
            sleep(1)

    # Wait for results
    wait(tasks, return_when=ALL_COMPLETED)
    return [task.result() for task in tasks]

Sends batches of addresses to the Geoapify Geocoding API using multithreading and rate-limiting, then collects the results.

- Batches the addresses into groups of 5 using `itertools.batched()`, to comply with the Geoapify Free plan (5 requests/second).
- Processes batches in parallel using `ThreadPoolExecutor`.
- Sleeps 1 second between batches to stay within the rate limit.
- Returns a list of geocoding results, preserving the input order.
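
`itertools.batched()` was added in Python 3.12. On an older interpreter, the batching step could be approximated with a small fallback like the one below. This is a sketch under that assumption, not part of the original script; the local `batched` helper is hypothetical, and `REQUESTS_PER_SECOND = 5` follows the description above.

```python
import itertools
from itertools import islice


def batched(iterable, n):
    """Yield successive tuples of at most n items, like itertools.batched."""
    if n < 1:
        raise ValueError("n must be at least one")
    iterator = iter(iterable)
    # islice returns an empty tuple once the iterator is exhausted,
    # which ends the loop.
    while batch := tuple(islice(iterator, n)):
        yield batch


# Prefer the standard-library version when it exists (Python 3.12+).
batched = getattr(itertools, "batched", batched)

REQUESTS_PER_SECOND = 5  # value taken from the description above
addresses = [f"address {i}" for i in range(12)]
batch_sizes = [len(b) for b in batched(addresses, REQUESTS_PER_SECOND)]
print(batch_sizes)  # [5, 5, 2]
```

The last batch is simply shorter when the number of addresses is not a multiple of the batch size.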
def generate_validation_report(addresses, geocode_results, min_confirmed, max_not_confirmed, output):
    # write csv with validation results
    with open(output, 'w', newline='') as f:
        fieldnames = ['Original Address', 'Validation Result', 'Reason']
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        for address, result in zip(addresses, geocode_results):
            stats = validate_address_geocoding(result, min_confirmed, max_not_confirmed)
            writer.writerow({'Original Address': address,
                             'Validation Result': stats[0],
                             'Reason': stats[1]})

Generates a CSV validation report that labels each address as:

- `CONFIRMED`
- `PARTIALLY_CONFIRMED`
- `NOT_CONFIRMED`

…and includes the reason for partial or failed validation.

- Opens the CSV file and sets up a header: `"Original Address", "Validation Result", "Reason"`
- Iterates through all input addresses and their corresponding geocoding results.
- Calls `validate_address_geocoding(...)` to evaluate confidence levels.
- Writes one row per address with the classification and reasoning.
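
Addresses usually contain commas, so it may help to see how `csv.DictWriter` handles them. The sketch below writes one row to an in-memory buffer instead of a file (an assumption made to keep it self-contained) and reuses the same header as the report.

```python
import csv
import io

buffer = io.StringIO()
fieldnames = ["Original Address", "Validation Result", "Reason"]
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()
writer.writerow({
    "Original Address": "Main St, Smalltown",
    "Validation Result": "PARTIALLY_CONFIRMED",
    "Reason": "CITY_LEVEL_DOUBTS",
})

lines = buffer.getvalue().splitlines()
print(lines[1])  # "Main St, Smalltown",PARTIALLY_CONFIRMED,CITY_LEVEL_DOUBTS
```

With the default quoting mode, only fields that contain the delimiter or other special characters are quoted, which matches the sample output shown earlier.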
def validate_address_geocoding(geocode_result, min_confirmed, max_not_confirmed):
    # Set Not confirmed for result with errors
    if not geocode_result or geocode_result.get('error') == 'Not found':
        return 'NOT_CONFIRMED', 'No geocoding result'

    rank = geocode_result.get('rank', {})
    # Retrieve ranks and fallback to 0 if not exists
    confidence = rank.get('confidence', 0)
    confidence_city = rank.get('confidence_city_level', 0)
    confidence_street = rank.get('confidence_street_level', 0)
    confidence_building = rank.get('confidence_building_level', 0)

    if confidence >= min_confirmed:
        return 'CONFIRMED', ''
    elif confidence <= max_not_confirmed:
        return 'NOT_CONFIRMED', ''

    # Define reason for Partially Confirmed rank
    for level, l_confidence in zip(['CITY', 'STREET', 'BUILDING'],
                                   [confidence_city, confidence_street, confidence_building]):
        if l_confidence == 0:
            reason = f'{level}_NOT_CONFIRMED'
        elif l_confidence <= max_not_confirmed:
            reason = f'LOW_{level}_LEVEL_CONFIDENCE'
        elif l_confidence <= min_confirmed:
            reason = f'{level}_LEVEL_DOUBTS'
        else:
            continue
        return 'PARTIALLY_CONFIRMED', reason
    return 'PARTIALLY_CONFIRMED', 'Unknown'

Analyzes a single geocoding result and determines the validation status based on confidence scores.

- If the geocoding failed or no result was found → returns `'NOT_CONFIRMED'` and `'No geocoding result'`.
- Extracts the following confidence values from `rank` (if missing, falls back to `0`):
  - `rank.confidence`
  - `rank.confidence_city_level`
  - `rank.confidence_street_level`
  - `rank.confidence_building_level`
- Validation logic:
  - If `confidence >= min_confirmed` → returns `'CONFIRMED'`
  - If `confidence <= max_not_confirmed` → returns `'NOT_CONFIRMED'`
  - Otherwise → returns `'PARTIALLY_CONFIRMED'` and a reason. The city, street, and building levels are checked in that order, and the first one that is not above `min_confirmed` determines the reason:
    - Level missing or `0` → `{LEVEL}_NOT_CONFIRMED`
    - Level ≤ `max_not_confirmed` → `LOW_{LEVEL}_LEVEL_CONFIDENCE`
    - Level between the thresholds → `{LEVEL}_LEVEL_DOUBTS`

If no specific reason is found, it falls back to 'Unknown'.
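
A short worked example may make the ordering of the reason check clearer. The helper below restates only the loop from `validate_address_geocoding`; its name `partial_reason` and its default thresholds (taken from the documented defaults 0.9 and 0.5) are assumptions made for illustration, and the rank values are invented.

```python
def partial_reason(rank, min_confirmed=0.9, max_not_confirmed=0.5):
    """Return the reason string for a PARTIALLY_CONFIRMED result."""
    # Levels are checked in a fixed order; the first one that is not
    # above min_confirmed decides the reason.
    for level in ("city", "street", "building"):
        value = rank.get(f"confidence_{level}_level", 0)
        if value == 0:
            return f"{level.upper()}_NOT_CONFIRMED"
        if value <= max_not_confirmed:
            return f"LOW_{level.upper()}_LEVEL_CONFIDENCE"
        if value <= min_confirmed:
            return f"{level.upper()}_LEVEL_DOUBTS"
    return "Unknown"


# Street is reported even though the building level is weaker,
# because street is checked first.
example = partial_reason({
    "confidence_city_level": 1,
    "confidence_street_level": 0.7,
    "confidence_building_level": 0.3,
})
print(example)  # STREET_LEVEL_DOUBTS
```

The `'Unknown'` fallback is reached only when all three levels are above `min_confirmed` while the overall confidence still falls between the thresholds.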
- Geoapify Geocoding API Documentation
- API Playground
- What is Address Validation?
- Create your free API key
This project is provided for demonstration purposes under the MIT License.
