Adam Kawa

Big Data Developer
Warsaw, Masovian Voivodeship

Skills

Apache Pig, Bash, Common Lisp, CVS, Git, Grails, Hadoop, HBase, Hive, Java, Java Enterprise Edition, MapReduce, Maven, Oozie, Perl, SQL, Subversion

Languages

Polish: native
English: fluent
Spanish: basic
Russian: basic

Work experience

GetInData
Founder & Big Data Developer
■ Installation, administration and security of Hadoop clusters - Hortonworks (HDP, Ambari), Cloudera (CDH, Cloudera Manager), Kerberos, Sentry
■ Large-scale log collection and delivery - Kafka, Camus
■ Large-scale ETL processes - Spark, Hive, Pig, Sqoop, Oozie, Luigi, HCatalog, Falcon
■ Batch processing and analysis - Spark, Spark SQL, Hive, Pig, Tez
■ Real-time stream processing - Flink, Spark Streaming, Storm
■ Real-time random read-write requests (NoSQL datastores) - Cassandra, HBase
■ Low latency analytics, search and BI backends - Spark Streaming, Storm, Elasticsearch, Solr, Impala
Blogger
Hakuna MapData!
■ Writing blog posts about big data that contain some bytes of humor
■ 23 blog posts and presentations on various topics related to Hadoop and big data, including Apache Pig, Apache HBase, Java MapReduce API, YARN, NameNode HA, HDFS Federation, Amazon Elastic MapReduce, Ganglia, and operating and troubleshooting large Apache Hadoop clusters.
■ Blog posts re-published by: Cloudera, IBM developerWorks, DZone
Spotify
Data Engineer in Analytics and Data Infrastructure, Spotify
Building and operating one of the largest Hadoop-YARN clusters in Europe.

■ Operating and troubleshooting a 590-node YARN Hadoop cluster
■ Operating and troubleshooting a 330-node MRv1 Hadoop cluster
■ Implementation of MapReduce jobs (Java MapReduce, Python Streaming, Pig, Hive, Luigi, Avro, Sqoop)

The most important tasks:
■ stabilizing the cluster after its rapid growth from 60 to 190 nodes
■ growing the cluster to 690 nodes
■ migration to HDP2 and YARN
■ deployment of NameNode HA, Capacity Scheduler and multiple tools from Apache Hadoop ecosystem
■ building testing clusters in the cloud
■ cluster capacity planning before purchasing 500 nodes
Hadoop Instructor
Compendium Educational Center (Authorized Cloudera Training Partner)
■ Delivering official Cloudera training courses (Hadoop Developer, Hadoop Administrator, HBase, Pig+Hive+Impala, Cloudera Essentials and custom courses).
Hadoop Developer
ICM Warsaw University
■ Implementation of various algorithms to analyze the content of a large collection of documents (academic papers) using the Apache Hadoop stack (Java MapReduce, HBase, Pig, Oozie)
■ Installation, configuration and administration of Hadoop and HBase clusters
IBM Polska
Software Developer
Research on Apache Hadoop distributions and appliances
Software Developer
Netezza Polska (an IBM Company), Warsaw, Poland
INZA Hadoop: Design, implementation, testing and documentation of MapReduce functionality on Netezza Performance Server (Apache Hadoop, MapReduce, Java SE 6, SQL)

Netezza Spatial Toolkit: Design and implementation of its core functionality, spatial and geospatial analysis functions (C/C++, SQL)

ARS Real: Implementation of a module to analyse customers' shopping habits for the Analytical and Reporting System for REAL (T-SQL, large structured datasets, SQL Server Reporting Services)
Citibank International plc
Summer analyst
Designing part of an internal asset management system and supervising its development
Motorola
Summer intern (Software Developer)
Implementing a set of libraries to support a test framework for call processing applications (C/C++, Perl)

Training and courses

Cloudera Administrator Training for Apache Hadoop, Berlin, 2011
Cloudera Training for Apache HBase, New York, 2010
Cloudera Developer Training for Apache Hadoop, New York, 2010

Cloudera Certified Hadoop Administrator, 2011
Sun Certified Programmer for Java SE 6 (93% score), 2011
Cloudera Certified Hadoop Developer, 2010

Education

Finance and Accounting, bachelor's degree
Szkoła Główna Handlowa w Warszawie
Computer Science, master's degree
Uniwersytet Warszawski

Specializations

Banking
Analysis/Risk

Interests

Travel, sports (football, tennis, skiing)

Other

Awards and Scholarships
* 2008 – 2009: Academic scholarship at Warsaw University
* 2004: Award for the best graduate from United Nations Organization High School in Bilgoraj
* 1997 – 1998: Scholarships from Bilgoraj's City Mayor for exceptional achievements

Presentations
* 2008: Presentation of Groovy on Grails, a web framework, at Warszawa Java User Group
* 2007: Presentation of my B.Sc. project, an OpenOffice Calc plugin (which allows users to carry out bioinformatics experiments), at OpenOffice.org Conference 2007 in Barcelona

Academic-related interests
* Distributed computing on large data sets
** Hadoop (HDFS, MapReduce)
* Artificial intelligence
** Data mining, machine learning, recommender systems
* Intermarket technical analysis

Computer Skills (Programming languages, operating systems, other technologies)
* Apache Hadoop, HDFS, MapReduce, Apache Pig, Apache Hive, Apache HBase
* Java SE, J2EE, Groovy on Grails
* HTML, PHP, Symfony, Yii
* T-SQL, PL/SQL, SQL, MySQL, NzSQL, ODRA
* Assembly, C/C++, Common Lisp, Dylan, OCaml, Pascal, Prolog, Smalltalk, XML
* Perl, Bash Scripts
* CVS, SVN, Maven, AccuRev

Groups

Uniwersytet Warszawski
The University of Warsaw, founded in 1816, is the largest Polish university and one of the best in the country.
Hadoop
A group devoted to projects in the Apache Hadoop family. More at http://hadoop.apache.org/
Szkoła Główna Handlowa w Warszawie