HCA, Hospital Corporation of America Big Data Developer Lead in Nashville, Tennessee

The role requires working closely with others, frequently in a matrixed environment, and with little supervision. As a consulting-level position, the role requires self-starters who are proficient in problem solving and capable of bringing clarity to complex situations. It requires contributing to strategic technical direction and system architecture approaches for individual projects and platform migrations. The culture of the organization emphasizes teamwork, so social and interpersonal skills are as important as technical capability. Because Big Data technology and practice are emerging and evolving quickly, the position requires staying well informed of technological advancements and being proficient at putting innovations into effective practice.

This role will provide leadership and deep technical expertise in all aspects of solution design and application development for specific business environments. The focus is on setting technical direction for groups of applications and similar technologies, as well as taking responsibility for technically robust solutions that account for all business, architecture, and technology constraints.

  • Responsible for building and supporting a Hadoop-based ecosystem designed for enterprise-wide analysis of structured, semi-structured, and unstructured data.

  • Manage and optimize Hadoop/Spark clusters, which may include many large HBase instances

  • Support regular requests to move data from one cluster to another

  • Manage production support teams to make sure service levels are maintained and any interruption is resolved in a timely fashion

  • Bring new data sources into HDFS, then transform and load them into databases.

  • Work collaboratively with Data Scientists and business and IT leaders throughout the company to understand Big Data needs and use cases.


    A successful candidate will have:

  • Bachelor’s degree in Computer Science or a related discipline, with at least 7 years of equivalent work experience

  • Data modeling experience using Big Data Technologies.

  • Strong understanding of best practices and standards for Hadoop application design and implementation.

  • 2 years of hands-on experience with Cloudera Distributed Hadoop (CDH) and experience with many of the following components:

    o Hadoop, MapReduce, Spark, Impala, Hive, Solr, YARN

    o HBase or Cassandra

    o Kafka, Flume, Storm, ZooKeeper

    o Java, Python, or Scala

    o SQL, JSON, XML

    o RegEx

    o Sqoop

  • Experience with Unstructured Data

  • Experience in developing MapReduce programs using Apache Hadoop for working with Big Data.

  • Experience deploying Big Data technologies to production.

  • Understanding of Lambda architecture design and real-time streaming

  • Ability to multitask and to balance competing priorities.

  • Requires strong practical experience in agile application development, file systems management, and DevOps discipline and practice using short-cycle iterations to deliver continuous business value.

  • Expertise in planning, implementing, supporting, and tuning Hadoop ecosystem environments using a variety of tools and techniques.

  • Knowledge of all facets of Hadoop ecosystem development including ideation, design, implementation, tuning, and operational support.

  • Ability to define and utilize best practice techniques and to impose order in a fast-changing environment. Must have strong problem-solving skills.

  • Strong verbal, written, and interpersonal skills, including a desire to work within a highly matrixed, team-oriented environment.

    Preferred

    A successful candidate may have:

  • Experience in Healthcare Domain

  • Experience in Patient Data

  • Experience with Predictive Models

  • Experience with Natural Language Processing (NLP)

  • Experience with Social Media Data

    Hardware/Operating Systems:

  • Linux

  • UNIX

  • Distributed, highly scalable processing environments

  • Networking: basic understanding of networking with respect to distributed server and file system connectivity, and troubleshooting of connectivity errors

    Databases:

  • RDBMS – Teradata

  • NoSQL, HBase, Cassandra, MongoDB, in-memory, columnar, and other emerging technologies

  • Other Languages – Java, Python, Scala, R

  • Build Systems – Maven, Ant

  • Source Control Systems – Git, Mercurial

  • Continuous Integration Systems – Jenkins or Bamboo

  • Config/Orchestration – ZooKeeper, Puppet, Salt, Ansible, Chef, Oozie, Pig

  • Ability to integrate tools outside of the core Hadoop ecosystem

    Certifications (a plus, but not required):

  • CCDH (Cloudera Certified Developer for Apache Hadoop)

Title: Big Data Developer Lead

Location: Tennessee-Nashville-Corporate Main Campus

Requisition ID: 10207-18554