Four ways to ship a dataset, four different jobs. Here's what each format is good at — and when to reach for it.
| Format | Best for | Strengths | Watch out for |
|---|---|---|---|
| CSV | Analysis, BI, universal exchange | Tiny, universal, streams easily | No types; commas/quotes need escaping |
| JSON | APIs, apps, NoSQL | Types, nesting, native to JS | Bigger files; flat tables feel verbose |
| Excel | Business users, ad-hoc analysis | Formulas, multiple sheets, formatting | Proprietary-ish; row limits; not for pipelines |
| SQL | Loading into a database | Schema + data in one file; ready to run | Verbose; dialect differences |
Comma-separated values is the lingua franca of data. Every spreadsheet, database, BI tool, and language reads it, files are small, and huge ones stream row by row. The trade-off: there are no data types (everything is text until something parses it), and values containing commas, quotes, or newlines must be quoted correctly — a frequent source of broken imports. Reach for CSV when you want maximum compatibility for analysis, BI dashboards, or moving data between tools.
JSON carries types (numbers, strings, booleans, null) and supports nested structures, and it's the native language of JavaScript and most web APIs. That makes it ideal for seeding apps, mocking API responses, and loading into document stores like MongoDB. The cost is size — keys repeat on every record — and for a simple flat table it's wordier than CSV. Reach for JSON when developers or APIs are the consumer.
An .xlsx file gives business users formulas, PivotTables, multiple sheets, and formatting in a familiar interface. It's the right deliverable when a human will open and explore the data directly. It's the wrong choice for automated pipelines (proprietary format, practical row limits, fragile to script against). Reach for Excel when handing data to analysts or stakeholders who live in spreadsheets.
A SQL dump bundles the schema and the data together: a CREATE TABLE with typed columns followed by INSERT statements. Run it and you have a populated table — no separate import step or column-typing guesswork. The downsides are verbosity and dialect differences (Postgres, MySQL, and others vary slightly). Reach for SQL when the destination is a database and you want it loaded in one step. The generators here infer sensible column types (integers, decimals, dates, varchars) when you export SQL.
Analyst or BI dashboard? CSV (or Excel if a person will open it). Developer, app, or API? JSON. Loading a database? SQL. When in doubt, CSV is the safe, universal choice — and you can always re-export the same dataset in another format here, since the data is reproducible from the seed.