General

datasciY.com

Introduction

My blog posts on more general subjects will be posted here.

Good Habits, Bad Habits - Ngram

Google Ngram chart showing how often the words "habit", "goal", and "evaluation" appear in English books over time. English-language books used "habit" more often than "goal" or "evaluation" in the earlier years, from 1890 to about 1955. "Habit" became less common after 1955, but its frequency rose sharply from 2007 to 2009. The book Good Habits, Bad Habits is about habits and how to form the good habits we want.

An n-gram is a sequence of n consecutive words. The Google Ngram Viewer counts n-grams in scanned books and charts, year by year, how often a word or phrase appears as a share of all n-grams of the same length. It can be used to find which words are becoming more popular, as shown here. Multi-word n-grams can also be used to study which words are frequently used together. The counts come from simple tallying rather than from a trained model such as k-nearest neighbors, although they can feed machine learning or other analyses. Google makes the underlying n-gram count data freely available on its website for further evaluation with any software and tools the user prefers.
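To make the counting idea concrete, here is a minimal Python sketch of how a word's share of all n-grams could be computed for each year. The function names and the two-sentence "corpus" are made up for illustration; this is not Google's code or data, and real n-gram datasets involve far more careful tokenization.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def relative_frequency(text, phrase):
    """Share of the text's n-grams that match `phrase` (case-insensitive)."""
    target = tuple(phrase.lower().split())
    grams = ngrams(text.lower().split(), len(target))
    if not grams:
        return 0.0
    return Counter(grams)[target] / len(grams)


# Hypothetical toy corpus keyed by publication year (not real book data).
toy_corpus = {
    1890: "habit is second nature and habit shapes character",
    2008: "set a goal then build a habit to reach the goal",
}

for year, text in sorted(toy_corpus.items()):
    share = relative_frequency(text, "habit")
    print(year, f"{share:.2%}")
```

Passing a two-word phrase such as "good habit" to the same function counts 2-grams instead of single words, which is the basic idea behind multi-word n-gram charts.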

English Ngram Chart

"Habit" was highest in 1890 at about 0.005%, then declined to about 0.002%, and was increasing again in the final year, 2008. "Goal" and "evaluation" started between zero and 0.001%, rose to 0.007%-0.008% during 1975-2000, then declined sharply to 0.005%-0.006% by 2008.

1890, William James, The Principles of Psychology. "The more of the details of our daily life we can hand over to the effortless custody of automatism, the more our higher powers of mind will be set free for their own proper work."

French Ngram Chart

The word "habit" is the highest of the three words, but the levels are lower than in English for all years; about 0.0012% is the high point. This needs to be tested again using French words to see whether the relationship among the words holds.