Data Scientists

No doubt by serendipity, several paths have come to intertwine and have led me to a shore where I feel more at home than with the by now somewhat ill-fitting label of "data miner". This new shore is that of a "Data Scientist".

What is a "Data Scientist"?

The first of the paths in this guide for initiates:

Beautiful Data

At Facebook, we felt that traditional titles such as Business Analyst, Statistician, Engineer and Research Scientist didn’t quite capture what we were after for our team. The workload for the role was diverse: on any given day, a team member could author a multistage processing pipeline in Python, design a hypothesis test, perform a regression analysis over data samples with R, design and implement an algorithm for some data-intensive product or service in Hadoop, or communicate the results of our analyses to other members of the organization in a clear and concise fashion. To capture the skill set required to perform this multitude of tasks, we created the role of "Data Scientist".

Information Platforms and the Rise of the Data Scientist. Jeff Hammerbacher.

Another of the paths leads to this port:

  • Learn about matrix factorizations
  • Start learning statistics by coding with R.
  • Learn about distributed systems and databases.
  • Learn about machine learning.
  • Learn about least-squares estimation and Kalman filters.
  • Study Engineering.

And this other one, with its suggestive title, leads to this more specific one:

  • Obtain: pointing and clicking does not scale.
  • Scrub: the world is a messy place
  • Explore: You can see a lot by looking
  • Models: always bad, sometimes ugly
  • iNterpret: “The purpose of computing is insight, not numbers.”
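
To make the five steps a little more concrete, here is a minimal sketch in Python of what each one might look like on a toy problem. Everything in it is an assumption made for illustration: the `RAW` data is invented, the function names simply mirror the list above, and the "model" is a plain least-squares line, chosen only because it is the simplest thing that could stand in for that step.

```python
import csv
import io
import statistics

# Hypothetical raw data, invented for this sketch: note the missing
# value and the non-numeric entry that the scrub step has to deal with.
RAW = """x,y
1,2.1
2,3.9
3,
4,8.2
five,9.9
5,10.1
"""


def obtain(text):
    """Obtain: read the rows programmatically rather than by hand."""
    return list(csv.DictReader(io.StringIO(text)))


def scrub(rows):
    """Scrub: keep only rows where both fields parse as numbers."""
    clean = []
    for row in rows:
        try:
            clean.append((float(row["x"]), float(row["y"])))
        except (TypeError, ValueError):
            continue  # messy row: skip it
    return clean


def explore(points):
    """Explore: look at simple summaries before modelling anything."""
    xs, ys = zip(*points)
    return {
        "n": len(points),
        "mean_x": statistics.mean(xs),
        "mean_y": statistics.mean(ys),
    }


def model(points):
    """Model: fit y = slope * x + intercept by ordinary least squares."""
    xs, ys = zip(*points)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def interpret(slope, intercept):
    """iNterpret: turn the numbers into a statement a person can use."""
    return f"y changes by about {slope:.2f} per unit of x (intercept {intercept:.2f})"


if __name__ == "__main__":
    points = scrub(obtain(RAW))
    print(explore(points))
    print(interpret(*model(points)))
```

The point of the sketch is the shape of the pipeline, not the particular choices: each step is a small function that can be replaced on its own when the data or the question changes.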

Starting with a change of identity is not a bad idea.