129 AEC Datasets, Verified
Open research data for architecture, engineering & construction — floor plans, facades, 3D city models, energy data, codes, and procurement.
3D BAG — 10M 3D Buildings (Netherlands)
💚 Open: 10 million automatically reconstructed 3D buildings covering the entire Netherlands in LoD1.2, LoD1.3, and LoD2.2.
3D Format Shootout — IFC Test Datasets
💚 Open: Test datasets for 3D file formats focusing on IFC, OBJ, FBX, GLTF, XKT, 3DS, MAX and other 3D formats for benchmark comparisons.
3DITA — Indian Temple Architecture Point Clouds
💚 Open: 325+ million points from 47 Nagara-style temple structures. India's first benchmark dataset for temple architecture semantic segmentation.
ADA Accessibility Standards
💚 Open: Full text of 2010 ADA Standards for Accessible Design and ADA/ABA Accessibility Guidelines from the U.S. Department of Justice.
AECBench — LLM Evaluation for AEC
💚 Open: 4,800 questions across 23 task types from ECADI + Tongji University covering building codes, design calculations, and construction management.
AECV-Bench
💚 Open: Benchmark for evaluating AI ability to process architectural floor plans. Text extraction ~95%, spatial reasoning 55-75%, symbol counting 39-51%.
AIA Contract Documents
🔒 Paid: 300+ industry-standard A-Series, B-Series, C/D/E/G-Series contract forms including A101, A201, B101. Most widely used US construction contracts.
ARCAT — 33,000+ Free Product Specs
💚 Open: 1,100+ manufacturers, 33,000+ building products organized by CSI MasterFormat with downloadable specs, AutoCAD, and Revit BIM families.
ArCH — Heritage Point Cloud Segmentation
💚 Open: 17 annotated heritage scenes (millions of labeled 3D points) plus 10 unlabeled scenes. 10 semantic classes, UNESCO World Heritage sites included.
ARCH2S — Exterior Architectural Structures
💚 Open: Semantically-enriched photo-realistic 3D architectural models for semantic segmentation of building exteriors. 4 building types from Hong Kong.
ArchCAD-400K
💚 Open: 413,062 chunks from 5,538 CAD drawings across 27 categories. Largest architectural CAD dataset for panoptic symbol spotting. NeurIPS 2025.
Archilyse — Architecture AI Training Data
💚 Open: AI training dataset for real estate/architecture with an open-source, AGPL-licensed floor-plan-to-IFC annotation pipeline. Presented at ICCV 2023 in Paris.
Architect Salary Survey — 13K+ US Salaries
💚 Open: 13,000+ architect salary surveys across the United States with data visualization. Covers firm size, location, and career level.
Architectural Heritage Elements Image64
💚 Open: Architectural heritage elements for classification at 64×64 resolution, organized by element class in folder-per-class structure.
Architectural Styles Dataset — 25 Classes
💚 Open: 10,113 images across 25 architectural styles from multiple time periods and mixed sources. Organized in 25 class folders.
ArchShapesNet — 44,000 BIM Element Samples
💚 Open: 44,000 BIM element samples (4,000 per class × 11 classes). First large-scale BIM element classification dataset from Seoul National University of Science and Technology.
Awesome CityGML — 210M+ Global 3D Buildings
💚 Open: Curated directory of 210+ million semantic 3D buildings from 21 countries and 65+ cities including Netherlands, Poland, Luxembourg, Germany, and Singapore.
Awesome Procurement Data
💚 Open: Curated list of US federal procurement data resources, APIs, and utilities including pysam, SamDotNet, and procurement-tools Python wrappers.
BatchPlan — Floor Plan Extraction from IFC
💚 Open: Python tool that extracts geometric data (CSV/WKT) from IFC models and generates 2D floor plans (PNG) for large-scale dataset creation.
BD3 — Building Defects Detection Dataset
💚 Open: 3,965 annotated RGB images from 50+ buildings (10–60 years old) covering 6 defect types: algae, major/minor crack, peeling, spalling, stain.
BIM Open Schema + DuckDB
💚 Open: Open standard schema enabling sub-second data export from Revit models to DuckDB. Eliminates traditional BIM data bottlenecks.
BIMNet — Scan-to-BIM Benchmark
💚 Open: 116.5M points from 25 real-world scans, 382 rooms, 8,710 m². First openBIM-based scan-to-BIM dataset with IFC ground-truth models.
Brick by Brick 2024 — Building Data Classification
💚 Open: ML competition ($20K AUD) for automating building data classification using Brick schema. Based on NeurIPS 2024 Building Timeseries Dataset.
BRIDGE — 13,000+ Floor Plans with Descriptions
💚 Open: Largest annotated floor plan dataset for Document Analysis. Aggregates ROBIN, SESYD, and web sources with bounding boxes and paragraph descriptions.
Building Facade Segmentation (Roboflow)
💚 Open: 598 images with 10 classes (including balcony, car, facade, fence, shop, street, traffic, vegetation, and window) in COCO JSON, Pascal VOC, and YOLOv8 formats.
BuildingsBench — 900K Building Energy Profiles
💚 Open: 53.6M measurements from 3,053 meters in 1,636 buildings over 2 years, plus 900K synthetic profiles from NREL. NeurIPS 2023.
CAADRIA 2026 — 71,334 Architectural Plans
💚 Open: 71,334 automated architectural plans in DrawScript vector format generated using shape grammar and visual programming. AI training dataset.
Caribou — OpenStreetMap for Grasshopper
💚 Open: Framework for processing large-scale OpenStreetMap urban data in Grasshopper/Rhino, enabling parametric urban modeling workflows.
CBE Post-Occupancy Survey
💚 Open: Research data from Center for the Built Environment comparing WELL vs LEED building IEQ satisfaction scores. Free download.
City-Level Open Permit Data
💚 Open: Building permits from major US/global cities via open data portals: NYC, Chicago, Seattle, Los Angeles, San Francisco, Vancouver, Melbourne.
CMP Facade Database
💚 Open: 606 rectified facade images with manual pixel-level annotations across diverse architectural styles from the Czech Technical University.
CODE-ACCORD — Building Regulation NLP Dataset
💚 Open: 862 sentences, 4,297 entities, 4,329 relations from 33 regulatory documents (1,595 pages) annotated by 12 experts. England and Finland.
Construction Estimation Data
💚 Open: Tabular construction cost and estimation records for training AI-driven cost estimation and bid analysis models.
Construction-PPE — Safety Equipment Detection
💚 Open: Real construction environment PPE compliance/non-compliance images in Ultralytics YOLO format for worker safety monitoring.
ConstructionSite-10k
💚 Open: 10,013 construction site images (7,009 train / 3,004 test) for testing vision-language models on construction inspection tasks.
Cool, Quiet City — Urban Comfort
💚 Open: Smartwatch data on residents' perceptions of urban noise and heat collected via the Cozie smartwatch platform. Kaggle ML competition.
CSB — Cracks in Steel Bridges
💚 Open: Steel bridge images with pixel-wise fatigue crack annotations from Rijkswaterstaat and ProRail, covering cracks, corrosion, and defect-free structures.
CubiCasa5K
💚 Open: 5,000 real-world floor plan images from Finnish real estate with 80+ annotation categories including rooms, doors, windows, walls, and stairs.
Data.gov — Building Code & Zoning Datasets
💚 Open: Building permits, zoning GIS data, development permits, property maintenance codes, land development codes, and state energy codes from the US government.
EC3 — Embodied Carbon in Construction
💚 Open: 150,000+ verified Environmental Product Declarations (EPDs). Largest open-access database for embodied carbon in building materials.
eCFR — Electronic Code of Federal Regulations
💚 Open: All US federal regulations including HUD, ADA, OSHA, and fire codes in XML bulk format with continuous updates.
ECP Facade Database
💚 Open: 104 rectified Haussmannian building images from Paris with 7 semantic classes (wall, window, sky, shop, balcony, door, roof); some annotation versions add chimney as an eighth.
eTRIMS Image Database
💚 Open: 60 annotated non-rectified facade images with 4-class and 8-class annotation variants covering irregular patterns.
FloorPlanCAD
💚 Open: 15,663 real-world CAD floor plans with fine-grained annotations for 30 object categories (doors, windows, furniture, equipment). ICCV 2021.
Fragments — Open-Source BIM Format
💚 Open: Performance-optimized binary BIM format rendering millions of objects with LOD. Reduces 2 GB IFC files to ~200 MB with geometry deduplication.
Free Construction Contract Templates
💚 Open: Basic construction contract templates (lump sum, cost-plus, time & materials, design-build) from TemplateLab, Mastt, Jotform, and eForms.
Global Procurement Dataset — 72M Contracts
💚 Open: 72 million contracts from 42 countries (2006–2021) with buyer/supplier info, geolocation, product classification, price, and corruption risk indicators.
GlobalBuildingAtlas
💚 Open: 2.75 billion 3D building structures at 3-meter resolution derived from ~800,000 satellite images. Global coverage of footprints, heights, and LoD1 models.
GraphRAG for Smart Buildings
💚 Open: LLM + Knowledge Graph pipeline for smart building management using IFC data parsed into Neo4j graphs with ifcopenshell.
GYU-DET — Multi-Defect Bridge Dataset
💚 Open: 11,123 high-resolution bridge images with 6 defect types (cracks, spalling, seepage, honeycomb surface, exposed rebar, holes). Nature 2025.
Heritage Building Defect Detection
💚 Open: Defect detection dataset for heritage buildings with bounding box and segmentation mask annotations for inspection applications.
Heritage Images & Point Clouds (Zenodo)
💚 Open: Photogrammetric images and annotated point clouds for 5 heritage buildings following ArCH classification standards.
HUD SOCDS — US Building Permits by Metro Area
💚 Open: Building permit data for all US Metropolitan Areas, Central Cities, and Suburbs from the U.S. Dept. of Housing and Urban Development.
Hybrid-CGAN — Synthetic Building Fault Data
💚 Open: Synthetic fault data for building fault detection using EnergyPlus simulation + GANs. 50% improvement in FID scores, classifier accuracy 0.82 → 0.94.
Hypergraph Floor Plans
💚 Open: Nature Communications framework for automated floor plan generation using hypergraphs. Open-source Python code with graph data structures.
HZNU Facade Dataset
💚 Open: 624 high-resolution buildings with 43,277 annotated windows (avg 4056×3856 px) from Hangzhou, China. Includes homography matrix annotations.
ICC Digital Codes — Full US Building Codes
💚 Open: Full text of IBC 2024/2021, IRC, IFC (Fire), IMC, IPC, IECC and more. Free to search online; REST API via Code Connect (paid).
IDSedit — Visual IDS Editor
💚 Open: Node-based visual editor for Information Delivery Specifications (OpenBIM). Drag-and-drop IDS rule creation without XML coding.
IFC BIM QA Dataset
💚 Open: 13,485 question-answer pairs covering BIM and IFC domain knowledge for training/testing LLMs on BIM-specific queries.
IFC ML Classification — 3D Object Recognition
💚 Open: ML pipelines for automated 3D IFC object classification achieving ~98% accuracy using GNN, Stacked RF, and Deep Learning.
IFC Model Checker — Open-Source QA
💚 Open: Browser-based IFC validation using IDS standard and IfcOpenShell. Validates IFC models against Information Delivery Specifications.
IFC-Bench — LLM-based IFC QA Benchmark
💚 Open: 21 IFC model projects with 1,027 QA pairs for testing LLMs on natural language queries to IFC building information retrieval.
ifc2duckdb — IFC to SQL Database
💚 Open: Converts IFC BIM files to DuckDB for high-performance sub-second SQL queries on building data. Open-source hackathon project.
IfcLCA — Embodied Carbon from BIM
💚 Open: Free, open-source browser-based embodied carbon calculator. Converts IFC models to CO2-equivalent data using Swiss national carbon databases.
IFCNet — 19,000 BIM Entity Models
💚 Open: ~19,000 CAD models across 65 IFC classes (IFCNetCore: 7,930 objects, 20 classes) extracted from ~1,000 IFC models for BIM entity classification.
Ireland Planning Database
💚 Open: National Planning Application Database with spatial and tabular data. Dublin City Council applications from 2003 to present.
IRFs — Irregular Facades
💚 Open: 1,057 high-quality facade images from 104 countries (1895–2023) with 6 segmentation classes: Background, Plant, Wall, Window, Door, Fence.
KAAN Dataset
💚 Open: 800+ Dutch apartment IFC models from real housing projects with annotated floor plans (WKT), material data, and spatial graphs.
LabelMeFacade
💚 Open: Highly irregular and diverse building facade images from the LabelMe segmentation dataset, challenging standard segmentation approaches.
LCAx — Lifecycle Assessment Data Exchange
💚 Open: Open standard for exchanging LCA results and EPD data across software platforms, born from Denmark's 2023 building regulation on LCA submissions.
LLM-Knowledge-Pool-RAG
💚 Open: Learning resource and implementation guide for architects building RAG (Retrieval-Augmented Generation) systems with vector databases.
LUMO Benchmark — Outdoor Vibration Monitoring
💚 Open: 9 m lattice mast structure with 18 reversible damage cases across 6 levels. Accelerometers, strain gauges, and temperature sensors for SHM research.
MBDD2025 — Building Surface Defects (UAV)
💚 Open: 14,471 UAV-collected building images across 6 structure types (including steel, concrete, wood, and brick) with defects such as cracks, leakage, and corrosion.
MIT Places365 (Building Subset)
💚 Open: 1.8M images (Standard) to 8M (Challenge) across 365 scene categories, many of them building-related. Available for download on Kaggle and the official site.
MLSTRUCT-FP — 954 Chilean Floor Plans
💚 Open: 954 high-resolution multi-unit residential floor plans from 165 Chilean projects with wall and slab polygon annotations.
Modern Architecture — 100K Images
💚 Open: ~100,000 modern architecture building photographs on Kaggle. No labels or annotations — raw photography dataset.
Modified Swiss Dwellings
💚 Open: ~16,800 annotated apartment-level floor plans with rich graph annotations covering windows, doors, orientation, and 22 room subtypes.
MSD — Floor Plan Generation Benchmark
💚 Open: Benchmark dataset for floor plan generation of building complexes, published at ECCV 2024. Available on Kaggle, GitHub, and arXiv.
National Zoning Atlas — US Zoning Data
💚 Open: Standardized zoning data covering 200+ regulatory characteristics across the US. Most comprehensive national zoning dataset.
NFPA Free Access — 300+ Fire Codes
💚 Open: Full text of NFPA 101 (Life Safety), NFPA 72 (Fire Alarm), NFPA 13 (Sprinkler), and 300+ fire codes via NFPA LiNK.
OCDS Registry — Global Public Procurement
💚 Open: 100+ country procurement datasets in standardized Open Contracting Data Standard covering planning, tender, award, contract, and implementation stages.
Open City Model — 125M 3D Buildings (USA)
💚 Open: ~125 million 3D building geometries across the entire US derived from USBuildingFootprints, available on AWS S3.
OpenBIM MCP Server — AI + BIM Integration
💚 Open: Model Context Protocol server connecting AI assistants to OpenBIM data. Enables AI to query, reason, and answer questions about IFC models.
OpenBIMtoFEM — BIM to Structural Analysis
💚 Open: Framework converting OpenBIM IFC models to Finite Element Method analysis meshes. Python-based IFC to FEM pipeline.
OpenConstruction-Datasets — 51+ Dataset Catalog
💚 Open: Systematic catalog of 51+ open-access visual datasets for construction AI covering safety, quality, progress, equipment across multiple modalities.
OSArch Example Files — Open BIM Samples
💚 Open: Curated collection of high-quality BIM files in IFC2X3, IFC4, and IFC4x3 covering architectural, structural, and MEP models.
Pix2Pix Facades Dataset
💚 Open: 400 facade photographs paired with hand-labeled semantic maps (walls, windows, doors). The classic pix2pix training dataset.
Planning London Datahub
💚 Open: Live planning application and development proposal data from all London boroughs, updated daily by the Greater London Authority.
planning.data.gov.uk — UK Planning Applications
💚 Open: National planning and housing data from all English Local Planning Authorities with map view, search, and download. CSV, JSON, GeoJSON.
Purdue PTBC — Building Code NLP Corpus
💚 Open: Part-of-speech-tagged building code corpus used to train a Bi-LSTM RNN with BERT for building code NLP. The resulting POS tagger reaches 95.11% precision.
PythonForIFC — 12+ BIM Scripts
💚 Open: 12+ Python utilities for IFC file manipulation and BIM workflow automation. Open-source scripts for common IFC operations.
QTOpro — IFC Quantity Takeoff Tool
💚 Open: Browser-based IFC analysis for construction quantity takeoff. No uploads or installation required. Extracts quantities directly from IFC for cost estimation.
ResBIM — Synthetic BIM Pairs
💚 Open: 1,000+ paired samples each containing a parametric 3D Revit BIM model and annotated 2D floor plan, for BIM automation research.
ResPlan — 17,000 Residential Floor Plans
💚 Open: 17,000 vector-graph floor plans with precise wall, door, window, and functional space annotations. Graph representations with NetworkX compatibility.
RFPDB.com — Architecture RFP Database
💚 Open: Government, for-profit, and non-profit RFPs with a dedicated architecture category. Free listings with no subscription required.
ROBIN — 510+ B&W Floor Plans
💚 Open: 510 black-and-white architectural floor plans plus 122 scanned documents (ROBIN++) for document analysis and automatic retrieval.
RooFormer — 3D Roof Reconstruction
💚 Open: Deep learning model reconstructing 3D roof models from high-resolution aerial/satellite imagery for automatic LoD2 building generation.
RPLAN — 80,788 Asian Residential Floor Plans
🔶 Request: 80,788 densely annotated floor plans from real Asian residential buildings with room types, wall distinctions, and door types. Used by HouseGAN.
rrustom/architecture2022clean
💚 Open: Architecture image dataset (2022) on HuggingFace with Parquet metadata format.
S3DIS — Stanford Indoor Spaces
🔶 Request: 6 areas across 3 buildings (~270 rooms), with 13 semantic categories per point (including ceiling, floor, wall, beam, column, window, door, and furniture).
SAM.gov — US Federal Contract Opportunities
💚 Open: All US federal government contract opportunities (RFPs, solicitations, awards) for all agencies. REST API, CSV bulk downloads, PostgreSQL snapshots.
SDNET2018 — 56,000 Concrete Crack Images
💚 Open: 56,000+ images (256×256 px) of cracked and non-cracked concrete from bridge decks, walls, and pavements. Crack widths 0.06 mm–25 mm.
SESYD — 1,000 Synthetic Vector Floor Plans
💚 Open: 1,000 synthetic floor plans (16 architectural symbol models). Extended SFPI version: 10,000 images with ~300,000 furniture items.
SF Building Permits — 200K+ Records
💚 Open: 200,000+ permits (5 years) to 1.1M+ records (1980-2019) with estimated cost, description, permit type, location. Weekly updates from DataSF.
Shovels.ai — 180M+ US Building Permits
🔒 Paid: 180M+ AI-enriched permits across 30M US addresses with inspection pass rates, contractor profiles, and 85% US population coverage.
SODA — 20K+ Construction Site Images
💚 Open: 20,000+ images with 15 object classes covering workers, materials, machines, and layouts from multiple construction sites across conditions.
SpatialLM — 3D LLM for Point Clouds
💚 Open: Novel 3D large language model for processing point cloud data with architectural spatial understanding and natural language output.
SPECS — Streetscape Perception Dataset
💚 Open: Demographically balanced global urban visual perception dataset from a 1,000-person survey across multiple cities. Published in Nature Cities.
StructuralCodes — Engineering Calculations
💚 Open: Python library for structural engineering calculations following Eurocode and fib Model Code standards.
SYNBUILD-3D
💚 Open: 6.2 million synthetic residential buildings at LOD-4 with 3D wireframe graphs, floor plan images, and LiDAR-like roof point clouds.
Synthetic Floor Plans (Figshare)
💚 Open: 2,500 synthetic single-family floor plans from T0 (2 rooms) to T4 (10 rooms) typologies, in black-and-white and color-coded versions.
TBBR — Thermal Bridges on Building Rooftops
💚 Open: 926 annotated images (68.5 GB) from 6 UAV flights with 5 channels (RGB + thermographic + height). 6,927 thermal bridge annotations.
TED — EU Public Procurement
💚 Open: All EU public procurement notices since 1993, making it the world's largest procurement dataset. REST API, XML bulk downloads, CSV subsets.
terminusresearch/photo-architecture
💚 Open: High-resolution building and unique architecture images on HuggingFace with Parquet metadata. Full-resolution architectural photography.
TopologicPy — 3D Geometric Modeling
💚 Open: Open-source 3D topological modeling library integrated with Fragments for BIM. Python-based 3D topology data with IFC integration.
TUM-FACADE — Point Cloud Benchmark
💚 Open: 33 annotated building facades with ~333 million annotated LiDAR points. Semantic segmentation: windows, doors, balconies, moldings.
UAVID3D — UAV Building Reconstruction
💚 Open: 21 GB of UAV RGB and thermal imagery for 3D building reconstruction and thermal anomaly detection, including 3D meshes.
UK Contracts Finder
💚 Open: UK government procurement opportunities and awarded contracts in OCDS format. API and CSV bulk downloads via data.gov.uk.
Uniclass 2015 — UK Construction Classification
💚 Open: Unified classification system for UK construction covering products, systems, activities, and spaces. Used for BIM and specification classification.
UpCodes — US Building Codes (80+ Jurisdictions)
💚 Open: 80+ US jurisdictions, 190K+ local amendments, 6M+ code sections with AI Copilot for code research.
USAspending.gov — US Federal Spending
💚 Open: All US federal spending on contracts, grants, and awards back to FY2001. Filterable by NAICS codes (23xxxx for construction). REST API and CSV.
VoxCity — 3D City Model Generator
💚 Open: Open-source Python package for automated 3D city model generation from OpenStreetMap. Outputs voxel grids, GeoJSON, building heights, and vegetation data.
WikiChurches — Fine-Grained Architectural Styles
💚 Open: 9,485 church images with 631 bounding box annotations and architectural style labels from Wikipedia. NeurIPS 2021 Datasets & Benchmarks.
World Bank Contract Awards
💚 Open: Contract awards for World Bank-funded IDA/IBRD projects worldwide with Global Public Procurement Database on country procurement systems.
xBD — Satellite Building Damage Assessment
🔶 Request: 5,598 satellite image pairs (1024×1024) across 11 disaster events with 4-level damage labels covering hurricanes, wildfires, floods, and earthquakes.
YOLOv11 Construction Monitoring
💚 Open: Construction site monitoring combining YOLOv11 object detection with Autodesk ACC for automated progress tracking and safety monitoring.
Z24 Bridge — SHM Benchmark
💚 Open: 1 year of continuous accelerometer monitoring from Z24 highway bridge with progressive controlled damage. The most popular SHM benchmark.
ZAHA — Large-Scale Point Cloud Facades
💚 Open: 601 million annotated points in 5 and 15 class variants for large-scale facade semantic segmentation. Published at WACV 2025.
ZInD — 1,524 Homes with Floor Plans
🔶 Request: 71,474 panoramas from 1,524 real unfurnished US homes with annotated 2D/3D floor plans from 20 US cities. CVPR 2021.