Work: «Rebuilding Reliable Data Pipelines Through Modern Tools»
Author: Ted Malaska
Pages: 91
Edition: 1
Publisher: O’Reilly
Genre: Computer Science
Year: 2019
Language: English
Format: eBook (EPub)
File size: 5.6 MB
ISBN: 9781492058144
My work at Eva Health brought me professionally closer to the topic of modern data pipelines. I had already approached the subject during my doctoral research, but my time working at Banamex pigeonholed me and kept me away from real technological advances in data processing. What I sensed I needed for my research was something I grasped more through professional experience than through specific knowledge; in other words, it was more empirical and intuitive than anything else. However, my work at this startup showed me that I had not been so wrong about the subject and that I should deepen and formalize this knowledge.
O’Reilly offers this book as a free download from its website. It is labeled a «report», but I think it is still a good bibliographic reference. The author describes a data operations framework and shows the importance of testing and monitoring for planning, rebuilding, automating, and then managing robust data pipelines, whether they are in the cloud, on premises, or in a hybrid configuration.
The download page promises a great deal about what one will learn, but in truth the book is only a compendium (a good and well-structured one) of a practitioner's accumulated experience: a collection of tips and ideas, and nothing more. There are no recipes and there is no code. That said, readers who already have some experience with the subject will better appreciate what the book condenses, and for them it can become a reference to consult frequently. Those who are new to data pipelines and lack guidance from someone experienced will be left with more questions than answers.