What a year of community

Hello from a delayed post. In this post I’m going to write about my year of community. I joined several communities and attended related events over the past year. Below is a chronological list of the groups I joined and the events I took part in. Some of them are local events for Turkish speakers.

    1. 17th Jan 2015, I participated in the We listen to women in IT event, which was organized by the Google Anita Borg Scholarship Community.
    2. 23rd Jan 2015, I officially became a member of Kadın Yazılımcı (Women Techmakers). Kadın Yazılımcı is a community where women share their ideas and experiences, about computing or otherwise, to encourage those who follow them. Over the last year I published 3 posts about Python, 2 posts about algorithms and 1 post with my PyCon notes on the Kadın Yazılımcı website.
    3. 15th Mar 2015, I participated in the Women Techmakers Conference. I also staffed the Kadın Yazılımcı booth at this conference.
    4. 10th Apr 2015, I participated in PyCon 2015. I received a financial grant from the organization and did some volunteer work, such as staffing the PyLadies booth.
    5. 11th Jul 2015, I gave a presentation, Text classification via scikit-learn, at the PyIstanbul event. PyIstanbul is a group of Istanbul-based Python developers.
    6. 25th Jul 2015, I participated in PhpKonf and was also one of the panelists on the Kadın Yazılımcı panel at the conference.
    7. 13th Sep 2015, I was one of the organizers and mentors of the first DjangoGirls Istanbul event. I was also one of the proofreaders of the Turkish translation of the DjangoGirls tutorial.
    8. 12th Dec 2015, I was a mentor, organizer and participant at DjangoGirls Istanbul. That was an amazing event!

I hope to write more often this year.

Homemade Turkish POS Tagger

As you can easily see, the rapid increase in the number of online texts has also accelerated research on information retrieval. In particular, the content generated on social platforms keeps growing day by day, and these platforms have opened the way for a large number of texts in every language. Based on this, I decided to study authorship detection on Turkish texts. Unfortunately, there is far less work on authorship attribution for Turkish than for English, so I was forced to develop some basic tools myself. For example, I could not find a suitable POS tagger, so I developed my own tagger for Turkish using the Brill tagger.

Here is my Turkish POS tagger code.

I read the training data from a treebank file (METU-SABANCI).

I use nltk’s unigram, bigram and trigram taggers as backoff taggers.
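A backoff chain of this kind could be sketched with nltk roughly as follows. This is a minimal sketch and not the tagger's actual code: the tiny training set, the `build_backoff_tagger` helper and the default tag `Noun` are assumptions made up for illustration, while the real tagger is trained on the treebank.

```python
from nltk.tag import BigramTagger, DefaultTagger, TrigramTagger, UnigramTagger

# Toy training data invented for this example (NOT the treebank).
train_sents = [
    [("uzun", "Adj"), ("bir", "Det"), ("süre", "Noun"),
     ("sonra", "Adv"), ("geldim", "Verb"), (".", "Punc")],
    [("bir", "Det"), ("süre", "Noun"), ("bekledim", "Verb"), (".", "Punc")],
]


def build_backoff_tagger(sents, default_tag="Noun"):
    """Hypothetical helper: chain trigram -> bigram -> unigram -> default."""
    default = DefaultTagger(default_tag)               # last resort: one fixed tag
    unigram = UnigramTagger(sents, backoff=default)    # most likely tag per word
    bigram = BigramTagger(sents, backoff=unigram)      # word + previous tag
    trigram = TrigramTagger(sents, backoff=bigram)     # word + two previous tags
    return trigram


tagger = build_backoff_tagger(train_sents)
# "kitap" never appears in the toy data, so every n-gram tagger gives up
# and the default tag is used for it.
tagged = tagger.tag(["uzun", "bir", "süre", "kitap", "."])
print(tagged)
```

Each tagger only answers when it has seen the context during training; otherwise it hands the word to the next tagger in the chain.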

I applied 5-fold cross-validation to my tagger and got 90%-93% accuracy.
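In general, 5-fold cross-validation splits the tagged sentences into five parts, trains on four of them, measures accuracy on the remaining one, and repeats this so that every part is used for testing once. The plain Python sketch below shows the idea under stated assumptions: `k_fold_splits`, `cross_validate` and `train_most_frequent` are hypothetical helpers, the most-frequent-tag baseline only stands in for the real tagger, and the data is invented.

```python
from collections import Counter, defaultdict


def k_fold_splits(items, k=5):
    """Yield (train, test) lists for k contiguous, roughly equal folds."""
    items = list(items)
    n = len(items)
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        end = start + size
        yield items[:start] + items[end:], items[start:end]
        start = end


def train_most_frequent(train_sents, default_tag="Noun"):
    """Stand-in tagger: remember the most frequent tag of every word."""
    counts = defaultdict(Counter)
    for sent in train_sents:
        for word, tag in sent:
            counts[word][tag] += 1
    best = {word: c.most_common(1)[0][0] for word, c in counts.items()}
    return lambda words: [(w, best.get(w, default_tag)) for w in words]


def cross_validate(sentences, train_fn, k=5):
    """Return one token-level accuracy score per fold."""
    scores = []
    for train, test in k_fold_splits(sentences, k):
        tag = train_fn(train)
        correct = total = 0
        for sent in test:
            predicted = tag([word for word, _ in sent])
            for (_, gold), (_, guess) in zip(sent, predicted):
                correct += int(gold == guess)
                total += 1
        scores.append(correct / total if total else 0.0)
    return scores


# Invented toy data: five identical two-word sentences.
toy = [[("bir", "Det"), ("süre", "Noun")] for _ in range(5)]
scores = cross_validate(toy, train_most_frequent, k=5)
print(scores)
```

To evaluate a real tagger this way, `train_fn` would be replaced by a function that trains it on the given sentences and returns a callable that tags a list of words.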

    # Python 2 example; TRTagger is the tagger class from my POS tagger code.
    sentence = "Uzun bir süre sonra kendime geldim ."
    decoded_sentence = sentence.decode('utf-8')  # byte string -> unicode
    tr_brill = TRTagger()
    print tr_brill.turkish_pos_tagger(decoded_sentence)
    # Output:
    # uzun-Adj bir-Det süre-Noun sonra-Adv kendime-Pron geldim-Verb .-Punc

My PyCon notes

I was at PyCon, and it was my first PyCon, so I’ll talk about it now. PyCon is, beyond doubt, a great event. Before my notes on the talks, I want to mention the financial aid from the organisation. I received a financial grant to cover my transoceanic travel expenses, yay! I also did two volunteer jobs during the conference. First, I helped at the registration desk. Second, I worked at the PyLadies stand, where I sold approximately 20 PyLadies t-shirts. I also met great people while volunteering.

Now the talks can take the stage. On the first day I mostly sat in on machine learning related talks. (The talk titles link to pyvideo.org, where you can watch them easily.)

Machine Learning 101 - pandas, scikit-learn, gensim, Theano, Continuum packages for machine learning
“Words, words, words”: Reading Shakespeare with Python - text analysis, metadata, rhyme distribution (*it is a similar but lighter version of my authorship detection project)
Data Science in Advertising: Or a future when we love ads - Real-Time Bidded (RTB) advertising, Click Through Rate (CTR) prediction, auto-bidding systems, traffic prediction
Grids, Streets and Pipelines: Building a linguistic street map with scikit-learn - geojson, hyperparameters, geopandas
How to interpret your own genome using (mostly) Python - gemini, genome sequence
Losing your Loops: Fast Numerical Computing with NumPy - aggregation functions, universal functions, broadcasting, and fancy indexing (*that is my favourite! it’s so clear, simple and useful)
How to build a brain with Python - simulating the brain, Nengo, Spaun
Keynote – Guido van Rossum - Python 3, diversity
A Beginner’s Guide to Test-driven Development - TDD
Cutting Off the Internet: Testing Applications that Use Requests - requests, vcr, httpretty, mock, and betamax
Techniques for Debugging Hard Problems - always read the source, read all of the source
Finding Spammers & Scammers through Rate Tracking with Python & Redis - velocity engine, keyspaces and facets

I should also mention the poster session. I like clear and simple projects, and I saw a few clear and simple poster projects that I liked. Great job!

bonus, bonus, bonus: