Hacker News

Another reason why you provide data only over an API: don't reach into my tables and lock me into an implementation.


An approach I like better than "only access my data via API" is this:

The team that maintains the service is also responsible for how that service is represented in the data warehouse.

The data warehouse tables - effectively denormalized copies of the data the service stores - are treated as another API contract: clearly documented and tested as such.

If the team refactors, they also update the scripts that populate the data warehouse.

If that results in specific columns etc. becoming invalid, they document that in their release notes and ideally notify other affected teams.
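One way to make that contract testable is a schema check the owning team runs in CI against the warehouse tables they publish. A minimal sketch, assuming a hypothetical `orders` table (all table and column names here are invented for illustration):

```python
# Hypothetical contract for a warehouse table owned by the orders team.
# The expected schema is the documented contract; the check reports any
# missing or retyped columns so a refactor can't silently break consumers.

EXPECTED_SCHEMA = {
    "order_id": "BIGINT",
    "customer_id": "BIGINT",
    "total_cents": "BIGINT",
    "placed_at": "TIMESTAMP",
}

def check_contract(actual_schema: dict) -> list:
    """Return a list of contract violations (missing or retyped columns)."""
    violations = []
    for column, expected_type in EXPECTED_SCHEMA.items():
        if column not in actual_schema:
            violations.append(f"missing column: {column}")
        elif actual_schema[column] != expected_type:
            violations.append(
                f"type changed: {column} is "
                f"{actual_schema[column]}, expected {expected_type}"
            )
    return violations
```

In practice `actual_schema` would come from the warehouse's information schema; the point is that the check lives with the service team, so a refactor that invalidates a column fails their build, not a downstream team's dashboard.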


This same thing can be applied to contracts when firing events, etc. I point people to https://engineering.linkedin.com/distributed-systems/log-wha... and use the same approach to ownership.


Yeah, having a documented stream of published events in Kafka is a similar API contract the team can be responsible for - it might even double as the channel through which the data warehouse is populated.
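The event-as-contract idea can be sketched as a versioned envelope that both service consumers and the warehouse loader validate against. This is only an illustration; the event name, fields, and version scheme are assumptions, not anything from the thread:

```python
import json
from datetime import datetime, timezone

# Hypothetical published event: the schema name and version are part of
# the contract, so consumers (including the warehouse loader) can detect
# breaking changes the owning team announces in release notes.

def make_order_placed_event(order_id: int, total_cents: int) -> str:
    event = {
        "schema": "order_placed",
        "schema_version": 2,  # bumped on breaking changes
        "occurred_at": datetime.now(timezone.utc).isoformat(),
        "payload": {
            "order_id": order_id,
            "total_cents": total_cents,
        },
    }
    return json.dumps(event)
```

A real setup would typically enforce this with a schema registry rather than hand-rolled envelopes, but the ownership model is the same: the team that emits the event owns its documented shape.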



