In this authorship detection example, I used a combined tf-idf vectorizer of both word bigrams and character bigrams as features.
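One way to build such a combined vectorizer is with scikit-learn's FeatureUnion, which stacks the two tf-idf matrices side by side. This is a minimal sketch, not the exact setup used in the example; the toy documents are placeholders.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion

# Combine word-level bigrams and character-level bigrams into one
# tf-idf feature matrix (hypothetical parameters for illustration).
combined = FeatureUnion([
    ("word_bigrams", TfidfVectorizer(analyzer="word", ngram_range=(2, 2))),
    ("char_bigrams", TfidfVectorizer(analyzer="char", ngram_range=(2, 2))),
])

docs = ["the quick brown fox", "the lazy dog sleeps"]
X = combined.fit_transform(docs)
print(X.shape)  # (n_documents, n_word_bigrams + n_char_bigrams)
```

The resulting sparse matrix can be fed directly to any scikit-learn classifier.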
First of all, what is an n-gram? An n-gram is a contiguous sequence of n items from a given text or speech, where n is an integer greater than zero. Language models that take advantage of the ordering of words are called n-gram language models. An n-gram model can be pictured as sliding a small window over the text, through which only n words are visible at a time. The simplest n-gram model is the unigram model, where n is one; the window then shows only one word at a time. The more complex models, the bigram (n = 2) and the trigram (n = 3), are usually more informative than the unigram.
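The sliding-window idea can be written in a few lines of Python; `ngrams` here is a hypothetical helper, not part of the example's code.

```python
def ngrams(tokens, n):
    # Slide a window of size n over the token sequence;
    # each window position yields one n-gram.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

words = "the cat sat on the mat".split()
print(ngrams(words, 1))  # unigrams: one word per window
print(ngrams(words, 2))  # bigrams: adjacent word pairs
```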
The authorship detection example uses the Reuter_50_50 dataset for the training and test data. Reuter_50_50 contains 50 authors, with 50 texts belonging to each author. To start, the example takes the texts of only eight authors. These authors are
Here are the results:
n_samples: 400, n_features: 32885 for both the training and test data.
The table shows the precision, recall and F1-score of the L1-penalty SVC and the L2-penalty SVC for each of the eight authors.
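A comparison like this can be produced with scikit-learn's LinearSVC and classification_report. The sketch below uses a tiny made-up two-class corpus in place of the Reuter_50_50 texts, purely to show the mechanics of switching between the L1 and L2 penalties.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.metrics import classification_report

# Hypothetical toy corpus standing in for the Reuter_50_50 texts.
train_texts = ["stocks rallied today", "the striker scored twice",
               "bond yields fell sharply", "the keeper saved a penalty"]
train_labels = ["finance", "sport", "finance", "sport"]
test_texts = ["yields rallied sharply", "the striker saved a penalty"]
test_labels = ["finance", "sport"]

vec = TfidfVectorizer()
X_train = vec.fit_transform(train_texts)
X_test = vec.transform(test_texts)

for penalty in ("l1", "l2"):
    # The L1 penalty requires solving the primal problem (dual=False).
    clf = LinearSVC(penalty=penalty, dual=False)
    clf.fit(X_train, train_labels)
    report = classification_report(test_labels, clf.predict(X_test))
    print(f"--- penalty={penalty} ---")
    print(report)
```

The L1 penalty drives many feature weights to exactly zero, so it effectively performs feature selection on the large tf-idf vocabulary, while L2 keeps all features with small weights.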