GDP: One way to calculate GDP is to add up all the spending by people, businesses and government. Here we look at the component of GDP that covers spending by consumers.

The bea.gov website has all the details of GDP. If we look at PCE (Personal Consumption Expenditure) for 2019, we see $14.8T as the total money spent by consumers.

https://apps.bea.gov/iTable/iTable.cfm?ReqID=19&step=2#reqid=19&step=2&isuri=1&1921=underlying

We will divide consumer expenses in 2 parts => Retail sales and Services sales:

1. Retail sales:

It includes all durable and non durable goods sales

Retail Sales: A big component of GDP is Retail Sales.

This link describes Retail sales:

https://www.thebalance.com/what-is-retail-sales-3305722

So, you can think of retail sales as anything end consumers buy from a store or online, including services such as hair dressing, hotels, restaurants, bars, etc.

This link from USA census website provides details of retail sales starting from 1992.

https://www.census.gov/retail/marts/www/timeseries.html

This link shows Total Retail sales including everything (Stores selling physical goods (physical/online), gasoline, restaurants, Auto).

https://www.census.gov/retail/marts/www/adv44x72.txt

2019 Retail Categories: For 2019, total retail sales were $6.2T. There are 13 retail categories. Auto turns out to be the biggest component of retail sales. Following are a few important categories for 2019.

  1. Auto: For 2019, auto sales were about $1.1T. This includes auto services too. About 17M passenger vehicles were sold in the USA in 2019, at an avg price of $35K. That implies 17M*$35K = $600B in new car sales. Then there are used car sales, auto parts sales and auto services, which probably account for the remaining $0.5T ($400B in used car sales + $100B in repairs as per this link: https://carsurance.net/blog/automotive-industry-statistics/ )
  2. Online or mail only business (non-store): These were about $0.8T in 2019, a 10 fold rise from $0.08T in 1992. At this pace, online sales will soon exceed auto sales.
  3. Food services, drinking places: In 2019, this totaled $0.8T
  4. Grocery stores: About $0.7T for 2019
  5. Gas: For 2019, gas sales were about $0.5T. Assuming a gas price of $2/gallon, that implies 0.25T (250B) gallons sold. As a cross-check: with 200M vehicles on the road, each averaging 12K miles/year, and assuming 20 miles/gallon, each vehicle uses 12K/20 = 0.6K gallons a year (a bit under 2 gallons a day, which seems reasonable). So, total gasoline use is about 200M*0.6K = 120B gallons a year. That is only about half of what the sales figure implies, so we are off by a factor of 2. Either a lot of these vehicles burn a lot more than 2 gallons a day (more like 4 gallons a day), or some of the assumptions above are off.
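
The back-of-the-envelope numbers for auto and gas in the list above can be rechecked in a few lines of Python. This is only a sketch using the rounded figures quoted in the text (17M vehicles, $35K avg price, $2/gallon, 200M vehicles, 12K miles/year, 20 miles/gallon); the variable names are mine and none of this is official data.

```python
# Rough check of the 2019 auto and gasoline estimates quoted above.
# All inputs are the rounded figures from the text, not official data.

new_cars_sold = 17e6          # passenger vehicles sold in 2019
avg_new_car_price = 35e3      # dollars
new_car_sales = new_cars_sold * avg_new_car_price
print(f"New car sales: ${new_car_sales / 1e9:.0f}B")

gas_sales = 0.5e12            # dollars of gasoline sold in 2019
gas_price = 2.0               # assumed dollars per gallon
gallons_from_sales = gas_sales / gas_price

vehicles_on_road = 200e6
miles_per_year = 12e3         # assumed average per vehicle
miles_per_gallon = 20         # assumed average fuel economy
gallons_per_vehicle = miles_per_year / miles_per_gallon
gallons_from_driving = vehicles_on_road * gallons_per_vehicle

gap = gallons_from_sales / gallons_from_driving
print(f"Gallons implied by sales:   {gallons_from_sales / 1e9:.0f}B")
print(f"Gallons implied by driving: {gallons_from_driving / 1e9:.0f}B")
print(f"Per vehicle per day: {gallons_per_vehicle / 365:.1f} gallons")
print(f"Gap between the two estimates: {gap:.1f}x")
```

Changing gas_price or miles_per_gallon in the sketch shows how sensitive the factor-of-2 gap is to those two assumptions.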

If you look at sales data for 1992, retail sales were about $2T. So, in about 27 years, retail sales more than tripled to $6.2T. What's confounding is that retail sales go up by 4% - 5% every year, even though people's wages are going up by only 2% a year (as per IRS tax returns). So, where do people get the extra money to keep spending beyond their wage increase, year after year? Maybe it's the extra debt they take on to keep spending more than what they earn. But then the interest would start eating into their income, to the point where they can't afford to take on any more debt. Not sure what's going on. We need to find out.
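
To see what annual growth rate "more than tripled in 27 years" corresponds to, here is a small sketch of the compound annual growth rate (CAGR) calculation. It uses the rounded $2T and $6.2T figures from the text; the 2% comparison line just reuses the wage growth figure quoted above and is an illustration, not a model of actual wages.

```python
# Implied compound annual growth rate (CAGR) of retail sales, 1992 -> 2019.
sales_1992 = 2.0   # trillions of dollars (rounded, from the text)
sales_2019 = 6.2   # trillions of dollars (rounded, from the text)
years = 2019 - 1992

cagr = (sales_2019 / sales_1992) ** (1 / years) - 1
print(f"Retail sales CAGR over {years} years: {cagr:.1%}")

# For comparison: what 2% annual growth (the wage figure quoted above)
# would have turned $2T into over the same period.
wage_like_growth = sales_1992 * 1.02 ** years
print(f"$2T growing at 2%/year for {years} years: ${wage_like_growth:.1f}T")
```

The CAGR comes out a little above 4%, which is in line with the 4% - 5% yearly increase mentioned above, while 2% growth would have left sales well short of $6.2T.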

2. Services Sales:

The BEA website shows total consumer spending as $14.8T. "Goods" consumption is $4.6T, but auto sales show up there as only $0.5T (not sure why, since retail sales show auto at $1.1T). Services consumption is $10.2T. Part of these services (about $1.6T) was already included in the retail sales number above. So, we are left with about $8.6T in services that's not part of retail sales.

The main components of this $8.6T of services consumption are as below:

  1. Housing rental cost: This, combined with utility bills, is the biggest component of services. For 2019, total rental expenditure was $2.3T. Actual rent was $0.6T, while the rental equivalent of owner occupied housing was $1.7T. As a sanity check: 40M households are renting, and assuming avg rent of $15K/year => total rent = 40M*$15K = $0.6T. Also, with 70M homeowners and assuming $2K/month rental equivalent, we get about 70M*$2K*12 = $1.7T. NOTE: this is rental equivalent; the actual cost of owning a home is higher, since interest, property taxes, insurance, HOA dues and principal can easily amount to $3K/month (with avg home price of $400K, interest of 3%, property tax of 1%, HOA+insurance of 1% and 3% principal payment, it's easily 8% of home price a year). NOTE: it's hard to find the home owner's true cost of owning a house, since most people own houses that have appreciated in price, rather than having bought them at today's high prices. In order to find total expenses for housing, we would have to add up mortgage interest income for all US banks, total property tax collections for all houses in the USA, and home insurance income for all insurance companies in the USA. We can't include principal payments of mortgages for used houses, since they show up as an expense for the buyer but as a -ve expense for the seller (i.e. income for the seller), so it's just trading of houses between buyer and seller. We do have to include revenue from the sale of new houses, since these are the ones that got added to the economy. We have to include price appreciation of used houses though. FIXME: not sure how to include these.
  2. Housing Utilities: These expenses are around $0.35T with Electricity=$0.2T, Water=$0.1T and Gas=$0.05T
  3. HealthCare: These amounted to $2.5T for doctors visits, hospital bills, nursing home, etc. This is the 2nd biggest component of services. This doesn't include your health premium, as that's included in separate "insurance" category below.
  4. Financial services: These amounted to $0.75T. These include bank fees, brokerage commissions, trading fees, mutual fund charges, etc
  5. Insurance services: These were about $0.45T. These include health insurance, life insurance, auto insurance and home insurance.
  6. Personal Communication services: These add to about $0.35T.  These include Cable/Satellite services ($0.1T), Cell phone ($0.14T), Internet ($75B), landline($25B)
  7. Education services: Cost of college, schools etc amounts to $0.3T
  8. Professional services: These add to about $0.2T. These include legal services ($0.1T), accounting services (tax, etc = $0.05T), various organization dues, burial services, etc.
  9. Public transportation: These amount to $0.2T. These include air ($0.12T), road (cab, bus, etc amounting to $0.06T), rail and water transportation.
  10. Misc expenses: The remaining $1.2T is in many misc categories such as social services, religious activities, domestic household services, etc. These also include expenses of non profit institutions, which are about $0.5T. Non profit orgs have gross output of $1.7T and receipts of $1.2T, resulting in final consumption expenditure of $0.5T. FIXME: I'm not sure why expenses of non profit orgs are counted as consumer expenses.
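
As a quick consistency check, the ten components listed above can be summed and compared with the $8.6T services figure, and the housing estimates from item 1 can be recomputed. This is a sketch with the rounded numbers from the list; the dictionary labels are my own shorthand.

```python
# Service components from the list above, in trillions of dollars (2019, rounded).
services = {
    "housing rental (incl. owner-equivalent rent)": 2.3,
    "housing utilities": 0.35,
    "healthcare": 2.5,
    "financial services": 0.75,
    "insurance": 0.45,
    "personal communication": 0.35,
    "education": 0.3,
    "professional services": 0.2,
    "public transportation": 0.2,
    "misc (incl. non-profits)": 1.2,
}
total = sum(services.values())
print(f"Sum of listed service components: ${total:.2f}T")

# Rental estimates from item 1: households x assumed annual rent.
renters = 40e6 * 15e3              # 40M renting households at $15K/year
owners = 70e6 * 2e3 * 12           # 70M homeowners at $2K/month equivalent
print(f"Renters: ${renters / 1e12:.2f}T, owner-equivalent: ${owners / 1e12:.2f}T")

# Rough ownership cost: 8% of a $400K home per year, expressed per month.
monthly_cost = 400e3 * 0.08 / 12
print(f"Assumed ownership cost: ${monthly_cost:,.0f}/month")
```

The listed components add up to the $8.6T quoted above, so the "misc" bucket is effectively the balancing item.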

 

 

Put details

scikit-learn:

It's an open source machine learning library for python. It's built on top of SciPy and is distributed under the 3-Clause BSD license. The project was started in 2007 by David Cournapeau as a Google Summer of Code project, and since then many volunteers have contributed. scikit-learn is also known as sklearn (the name used when importing it) and provides simple and efficient tools for data mining and data analysis. It supports supervised and unsupervised learning. It also provides various tools for model fitting, data preprocessing, model selection and evaluation, and many other utilities.

Official website is:

https://scikit-learn.org/stable/

Install on CentOS 7:

scikit-learn requires:

  • Python (>= 3.6)
  • NumPy (>= 1.13.3)
  • SciPy (>= 0.19.1)
  • joblib (>= 0.11)
  • threadpoolctl (>= 2.0.0)

Scikit-learn plotting capabilities (i.e., functions that start with plot_ and classes that end with “Display”) require Matplotlib (>= 2.1.1). So, before you install scikit-learn, it's best to have NumPy, SciPy and Matplotlib installed. pip will install the required modules (NumPy, SciPy, joblib, threadpoolctl) for you if it finds them missing or too old.

Scikit-learn 0.20 was the last version to support Python 2.7 and Python 3.4. scikit-learn 0.23 and later require Python 3.6 or newer. As we will work with python3.6, we'll install scikit-learn 0.23.2, which is the latest version at the time of writing.

Cmd: run below cmd on Linux Terminal:

sudo python3.6 -m pip install -U scikit-learn

Screen messages:

We see the following on screen: It first downloads scikit-learn-0.23.2, then it checks for numpy (>= 1.13.3), scipy (>= 0.19.1) and a few other python modules. It downloads the ones that are needed. It uninstalls the ones that are older and replaces them with the newer version. As an ex, below I had numpy-1.19.1 installed, but pip found the newer numpy-1.19.2, so it uninstalled the older version and replaced it with the newer one.

  Downloading https://files.pythonhosted.org/packages/5c/a1/273def87037a7fb010512bbc5901c31cfddfca8080bc63b42b26e3cc55b3/scikit_learn-0.23.2-cp36-cp36m-manylinux1_x86_64.whl (6.8MB)

Collecting numpy>=1.13.3 (from scikit-learn)
  Downloading https://files.pythonhosted.org/packages/b8/e5/a64ef44a85397ba3c377f6be9c02f3cb3e18023f8c89850dd319e7945521/numpy-1.19.2-cp36-cp36m-manylinux1_x86_64.whl (13.4MB)

Collecting scipy>=0.13.3 (from scikit-learn)
  Using cached https://files.pythonhosted.org/packages/14/92/56dbfe01a2fc795ec92b623cb39654a10b1e9053db594f4ceed6fd6d4930/scipy-1.2.3-cp34-cp34m-manylinux1_x86_64.

Requirement already up-to-date: scipy>=0.19.1 in /usr/local/lib64/python3.6/site-packages (from scikit-learn)
Installing collected packages: joblib, numpy, threadpoolctl, scikit-learn
  Found existing installation: numpy 1.19.1
    Uninstalling numpy-1.19.1:
      Successfully uninstalled numpy-1.19.1
Successfully installed joblib-0.16.0 numpy-1.19.2 scikit-learn-0.23.2 threadpoolctl-2.1.0

Once we see the above success message, scikit-learn is installed on your system. As explained in the "modules" section, if the module gets installed correctly, we will see the module in below dir for python3.6:

/usr/local/lib64/python3.6/site-packages/sklearn => This is the scikit-learn dir. We also see a scikit-learn.libs dir which has *.so file (shared object library) and a scikit_learn-0.23.2.dist-info dir, which has all distribution info.

In order to check your installation and to see which version and where scikit-learn is installed, use below cmd:

> python3.6 -m pip show scikit-learn => It gives below o/p showing scikit-learn version 0.23.2 is installed


Name: scikit-learn
Version: 0.23.2
Summary: A set of python modules for machine learning and data mining
Home-page: http://scikit-learn.org
Author: None
Author-email: None
License: new BSD
Location: /usr/local/lib64/python3.6/site-packages
Requires: joblib, threadpoolctl, numpy, scipy
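
The same check can be done from inside a python pgm. The sketch below uses the standard library's importlib.metadata (available from Python 3.8; on 3.6/3.7 the importlib_metadata backport would have to be installed separately). The helper name installed_version is made up for this example.

```python
# Look up installed package versions from inside Python.
# importlib.metadata is in the standard library from Python 3.8 onward;
# on Python 3.6/3.7 the "importlib_metadata" backport offers the same API.
try:
    from importlib import metadata
except ImportError:  # Python < 3.8
    import importlib_metadata as metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# The package names below are the requirements listed earlier in this article.
for name in ("scikit-learn", "numpy", "scipy", "joblib", "threadpoolctl"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

Note that this reports versions for the interpreter that runs the script, so run it with python3.6 if that is where scikit-learn was installed.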

Error Messages:

As explained in "modules" section, if we just type "pip install -U scikit-learn", we may get multiple errors (files not found, etc) as we are not running the right version of pip for python 3.6. You may get any of the errors shown below (note that even though python3 is soft linked to python3.6, a bare pip cmd may keep using python3.4. So, it's best to run pip with python3.6 as explained above, and you will get a smooth installation):

numpy errors running with python3.4

Building wheels for collected packages: numpy
  Running setup.py bdist_wheel for numpy ... error
  Complete output from command /usr/bin/python3.4 ....

multiple gcc compile errors

  gcc -pthread _configtest.o -o _configtest
  _configtest.o: In function `main':
  /tmp/pip-install-r7v7kemj/numpy/_configtest.c:6: undefined reference to `exp'
  collect2: error: ld returned 1 exit status

  gcc: _configtest.c
  _configtest.c:1:20: fatal error: Python.h: No such file or directory
   #include <Python.h>
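
One way to avoid the wrong-interpreter problem above is to ask Python itself which interpreter is running and build the pip cmd from that. This is a small sketch using only the standard library; it prints the suggested cmd but does not run it.

```python
# Show which interpreter is running and build a pip command bound to it.
import sys

print("Interpreter:", sys.executable)
print("Version:", ".".join(str(part) for part in sys.version_info[:3]))

# scikit-learn 0.23 needs Python 3.6 or newer (per the requirements above).
meets_requirement = sys.version_info >= (3, 6)
print("Python >= 3.6:", meets_requirement)

# Running pip as "<this interpreter> -m pip" avoids picking up a pip
# that belongs to a different Python installation.
pip_command = [sys.executable, "-m", "pip", "install", "-U", "scikit-learn"]
print("Suggested command:", " ".join(pip_command))
```

If the printed interpreter is /usr/bin/python3.4 or similar, that explains the numpy build errors shown above.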

 

Usage:

import sklearn: We need to first import sklearn and other modules in any python pgm. These are the imported modules:

import numpy as np
import matplotlib.pyplot as plt
import sklearn

linear model: sklearn has built in regression models to find best fit for given data. More details here:

https://scikit-learn.org/stable/modules/linear_model.html

Linear Regression:

Here (X,y) data is fitted using weight coefficients. y may be a single target, or y may be multiple targets (i.e. y0, y1, etc that we are trying to fit simultaneously). Usually y is a single target for our purposes. Linear regression fits a linear model that minimizes the sum of squared errors. LinearRegression takes arrays X, y in its fit method and stores the coefficients of the linear model in its coef_ member and the bias (or intercept) in its intercept_ member. When y is a single target, coef_ is a 1D ndarray of shape (num_of_features,), while intercept_ is just a float. When y has multiple targets, coef_ is a 2D ndarray of shape (num_of_targets, num_of_features), while intercept_ is a 1D array of shape (num_of_targets,). The fit(X,y) method takes in 2 arrays, where X is a 2D array of shape (num_of_samples, num_of_features), with features X0, X1, and so on, and y is a 1D array of output values (or a 2D array of shape (num_of_samples, num_of_targets) for multiple targets).

Ex: This tries to fit data (X,y) using Linear regression. y=m0X0 + m1X1 + b

from sklearn import linear_model
reg = linear_model.LinearRegression() #reg is an instance of LinearRegression class (See Object Oriented pgm)
reg.fit([[0, 0], [1, 1], [2, 2]], [0, 1, 2]) #Here X has 2 attr X0, X1, and for each [X0, X1] we have y. So, for X=[0,0], y=0. Similarly for X=[1,1], y=1 and so on.
print(reg.coef_, reg.intercept_) #prints something like [0.5 0.5] followed by a tiny number on the order of 1e-16. These are the 2 coeff m0 and m1, and the intercept b that fit the data. So, y=0.5*X0 + 0.5*X1 + b is the best fit. b is close to 0 (ideally it should be 0, but floating point arithmetic leaves a tiny rounding error). Here coef_ is a 1D array while intercept_ is a float, as expected.
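
To see where the [0.5, 0.5] comes from, the same fit can be reproduced with plain NumPy. As far as I know LinearRegression solves an ordinary least-squares problem of this kind internally, but the centering step and the use of np.linalg.lstsq below are my own illustration, not a copy of scikit-learn's code. The extra targets in Y are made up to show the multi-target shapes.

```python
import numpy as np

X = np.array([[0, 0], [1, 1], [2, 2]], dtype=float)
y = np.array([0, 1, 2], dtype=float)

# Center the data so the intercept can be recovered afterwards.
X_mean = X.mean(axis=0)
y_mean = y.mean()

# Least-squares fit on the centered data. Because X0 and X1 are identical
# here, many (m0, m1) pairs fit equally well; lstsq returns the one with
# the smallest norm, which splits the weight evenly between the two.
coef, *_ = np.linalg.lstsq(X - X_mean, y - y_mean, rcond=None)
intercept = y_mean - X_mean @ coef
print(coef, intercept)

# Multiple targets: stack them as columns of Y (3 made-up targets here).
# lstsq returns one column of weights per target; transposing gives the
# (num_of_targets, num_of_features) layout described above.
Y = np.column_stack([y, 2 * y, -y])
W, *_ = np.linalg.lstsq(X - X_mean, Y - Y.mean(axis=0), rcond=None)
print(W.T.shape)
```

Any pair with m0 + m1 = 1 would fit this particular data exactly; the even split is just the smallest-norm choice.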

 

Logistic Regression:

This is implemented in the LogisticRegression() class. This implementation can fit binary, One-vs-Rest, or multinomial logistic regression with optional L1, L2 or Elastic-Net regularization. The solvers implemented in the class LogisticRegression are “liblinear”, “newton-cg”, “lbfgs”, “sag” and “saga”.

LogisticRegressionCV implements Logistic Regression with built-in cross-validation support, to find the optimal C and l1_ratio parameters according to the scoring attribute.

ex: It fits data (X,Y) using Logistic Regression where Y=0 or 1 for any given X. X is a 2D array, while Y is a 1D array, same as in the previous linear regression example. The difference is that coef_ and intercept_ now have different shapes. coef_ is a 2D ndarray of shape (1, num_of_features), while intercept_ is a 1D array of shape (1,). The extra dimension is there because the same attributes also serve multi-class problems, where coef_ has one row (and intercept_ one entry) per class; the binary case is just the 1-row version. The data for m, b that they contain is still the same style as in linear regression.

import sklearn.linear_model #import the submodule explicitly; a bare "import sklearn" may not load it
clf = sklearn.linear_model.LogisticRegressionCV()
clf.fit(X, Y)

print(clf.coef_, clf.intercept_) => prints coefficient matrix + bias (intercept) for the model that fits this data closest. Prints something like: coeff=[[ 0.02783873 -0.20163637]] intercept=[0.01543046]. NOTE: coef_ is a 2D array, while intercept_ is a 1D array (different than Linear Regression)

LR_predictions = clf.predict(X) => We can use the predict method and apply it on the original X dataset to see what predicted Y array it gives out. Coefficients stored in clf.coef_ (and clf.intercept_) are used by the predict method.

LR_predict_probability = clf.predict_proba(X) => This shows the probability for each example in the X dataset. Each row is a pair, with columns in the order of clf.classes_. So for Y=0 or 1, the 1st num is the probability of Y=0, while the 2nd num is the probability of Y=1; the two add up to 1.
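
Here is a small self-contained sketch that shows the shapes and the column order of predict_proba. The toy data is made up, and I use plain LogisticRegression instead of LogisticRegressionCV to keep the example tiny (cross-validation needs more samples per class to be meaningful).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up toy data: 8 samples, 2 features, binary labels.
X = np.array([[0.0, 1.0], [0.5, 0.8], [1.0, 1.2], [1.5, 0.9],
              [3.0, 1.1], [3.5, 0.7], [4.0, 1.0], [4.5, 1.3]])
Y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

clf = LogisticRegression()
clf.fit(X, Y)

print(clf.classes_)                 # class labels, in the column order used below
print(clf.coef_.shape)              # 2D: (1, num_of_features) for a binary problem
print(clf.intercept_.shape)         # 1D: (1,) for a binary problem

proba = clf.predict_proba(X)        # shape (num_of_samples, num_of_classes)
print(proba.shape)
print(proba.sum(axis=1))            # each row sums to 1

# Column 0 holds P(Y=0) and column 1 holds P(Y=1);
# predict() returns the more probable class for each sample.
predictions = clf.predict(X)
print(predictions)
```

Looking at clf.classes_ is the safest way to tell which column of predict_proba belongs to which label.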

 

This useless website has all the crap that you ever wanted to search for. If there's any crap that's missing here, or more crap that you want to see, then email me at maaldaar dot support at gmail dot com. You get the idea - use the website name followed by a word "support" (dot is optional), and it's a gmail email id.

This website is hosted on a cheap webserver (hostinger), with very good service (I'm paying about $3/month as of 2023. It was $1/month until 2022). It's running Joomla which is the open source content management software. You will find articles and more details about hosting providers and various content management software on this site in "Internet" section.

This website originated to solve my own problems of keeping track of a lot of things that I come across. I'm bad at remembering things (I mean very bad !!), and with the vast quantity of material having a short life span, I soon realized the need to put it somewhere for my own reference. Also, when learning new things, it's hard to know what are all the things that you should learn and retain, and what to throw away. On this site, I've tried to include things that I think are a good basic start and will serve you well in life, no matter what discipline you belong to. Of course being an engineer, my topics are biased towards electrical/computer engineering. Nonetheless, it's all crap, but the hope is that this much crap should suffice for our measly life. If you find topics that are missing from here or need more details or find errors (of which there are plenty, unfortunately), leave me a note. There's no limit to what kind of content this website will have. So, no topic is off limit, except "those" topics if you know what I mean.

The menu on left lists all categories and sub-categories. I've kept the depth to 4 or less, so that you can get to the required article in 4 or less clicks. Assuming I'm able to get at most 10 articles at each depth, a depth of 4 would allow me to have 10^4=10K articles which is more than what I can write in my entire life.

This website doesn't have any advertisements yet (as if advertisers are really lined up to put ads on this site). Most of the hits that you see on the articles are just "bots" visiting my site. I haven't made any comments section for any article. I had a bad experience with spammers and bots, where they took over the comment section with tens of thousands of comments per day. I'll start comment section, once I find a good open source comment module, which can protect me (and my site) from these nasty spammers. Until then, use the email addr above to contact me.

Now to the most important question: Why the name MAALDAAR for this website? Simple, that was the first domain I found available which was 8 characters or less with a .com extension. It's certainly a weird name for a website. It could have been Baaldaar, crapsite or anything random you could think of. As Shakespeare said, "What's in a name?" I think it was Shakespeare, but who cares - whoever said it didn't believe in the importance of a name anyway. Shakespeare didn't live in modern "brand" age. Today, a brand name attached to a crap can make it the hottest item pursued by humanity, so everything is in the NAME !!

In putting my concluding remarks, I hope we end up saving some money and gaining some knowledge - both the steps to being maaldaar (i.e getting wealthier). To quote a dialogue by "Johnny Lever" from a Bollywood movie "Kala Bazaar" ----> "Aadmi jitna ganja hota hai, utna hi maaldaar hota hai". For Hindi handicapped people, the translation is "The balder a man is, the richer he is". Even if you believe in it, please do not become "bald"-aar to be maaldaar. We got other ways to be that on this site !!

 

PIL/Pillow: PIL = Python Imaging Library.

PIL is a free library that lets us process images in various formats, such as JPEG, PNG, BMP, etc. However, PIL development stopped around 2011, and it supports only old Python versions (up to Python 2.7). Since there was no active development of PIL for Python3, a fork of PIL called Pillow was released to support Python3. Pillow was meant to be a drop-in replacement for PIL, which is why even though we install Pillow, we import it as PIL in a python pgm. The cmds, functions, modules etc in Pillow are essentially the same as in PIL, so the end user sees hardly any difference. It still looks like PIL is installed, and existing code keeps working with few or no changes. Tutorials for PIL generally work for Pillow too. But keep in mind that there is no PIL for python3, only Pillow. We should not install PIL, just Pillow, for any version of python going forward. If you had installed PIL, uninstall it, and install Pillow as shown below. Pillow is what we should learn, not PIL.

Pillow installation:

$ sudo python3.6 -m pip install pillow => installs Pillow as PIL. Here we are installing Pillow for python3.6

/usr/local/lib64/python3.6/site-packages/PIL/* => We see PIL dir, and lots of files under it, as well as 2 Pillow files that provides some additional info

ex: from PIL import Image => Even though the package we installed is named Pillow, the module we import is PIL (there is no module named pillow to import). So in our pgm, we just keep using PIL as if nothing changed.

Other Imaging libraries: There are other imaging libraries such as OpenCV, Matplotlib, etc. Matplotlib is mostly for plotting, but works on images too. However, Pillow looks more complete in terms of working on most types of images, and Matplotlib documentation also says that it falls back on Pillow for cases where it doesn't have inbuilt support. So, I would invest time in learning Pillow for working on images.

Good tutorial here:

https://www.tutorialspoint.com/python_pillow/index.htm

Syntax:

Pillow library uses the Image class. There are a lot of functions/methods that we can use on the Image class.

1. methods open, save, show, resize, etc: These methods of class Image are used to open the image, save it, show it and to resize it to some other size:

ex: test.py

from PIL import Image # Here we import Image module from Pillow library (even though we say PIL, it loads Pillow if Pillow is installed. However, if PIL is installed, then it will load from PIL, which is not what we want. So, we will have to uninstall PIL and install Pillow)

im = Image.open("images/cuba.jpg") #We use open function to load image into image class "im". We provide path starting from current dir, or we can provide the full path too.

im.show() #This shows image loaded above using show() method

im = im.rotate(45) #rotate image using rotate method

im.show() #Show rotated Image

im.save('beach1.bmp') #This saves the "im" image class as beach1.bmp (in bmp format). We can also specify the format, else it's inferred from file extension. Note that original image is still intact, as we saved with different file name.

resized_image = im.resize((round(im.size[0]*0.5), round(im.size[1]*0.25))) #this resizes the image to 1/2 of its width and 1/4 of its height (im.size is (width, height)). Or we can provide the size explicitly as a tuple: im.resize((64,64)). This is useful in AI, where we want to work with a standard size image.

resized_image.save('resizedBeach1.jpg') #this saves the resized image
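
The methods above can be tried without any image file on disk. The sketch below creates a solid-color image in memory with Image.new (the 200x100 size and the color are arbitrary choices of mine) and checks how size behaves under resize and rotate.

```python
from PIL import Image

# Create a plain 200x100 image in memory instead of opening a file.
im = Image.new("RGB", (200, 100), color=(255, 128, 0))
print(im.size)                      # size is reported as (width, height)

# resize() takes a single (width, height) tuple.
half_width = round(im.size[0] * 0.5)
quarter_height = round(im.size[1] * 0.25)
small = im.resize((half_width, quarter_height))
print(small.size)

# rotate() keeps the original canvas size unless expand=True is passed.
rotated_same = im.rotate(90)
rotated_expanded = im.rotate(90, expand=True)
print(rotated_same.size, rotated_expanded.size)

# resize() and rotate() return new images; the original is unchanged.
print(im.size)
```

Since rotate without expand=True keeps the canvas size, the corners of a rotated image get cut off; pass expand=True when the whole rotated picture is needed.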

2.  using with numpy: the Image class above can be used with the numpy module. We can convert an image into a numpy array and vice versa. This is very useful in AI.

A. convert array into image: fromarray() => used for creating image with numpy array:

ex: below ex creates a 150 by 250-pixel array (height=150, width=250) then fill left half of the array with orange and right half of the array with blue.

from PIL import Image

import numpy as np

arr = np.zeros([150, 250, 3], dtype=np.uint8) #creates a numpy array object of shape (150,250,3) filled with all 0. The innermost triplet holds the R,G,B colors.

arr[: , :125] = [255, 128, 0] # This slicing says for axis=0 choose the full range (i.e. all rows in the picture, since the outermost axis is rows or height). Then for axis=1, it selects the range 0 to 125 (i.e. cols 0 to 124 of the width). To this array slice (which happens to be the left half), it assigns the RGB value [255,128,0], which corresponds to orange

arr[: ,125:] = [0, 0, 255] #same as above, except that for axis=1, the range is from 125 to the end (which happens to be the right half). It assigns the RGB value for pure blue

img = Image.fromarray(arr) #forms an image object from given 3D array

img.show()

img.save("RGB_image.jpg")

 B. convert image into array: array() => used to extract image pixels from a picture and convert them into numpy array

ex:

im = Image.open('my_image.jpg')
arr_image = np.array(im) # we can pass the image object to numpy's array func (explained in numpy module section). arr_image becomes a 3D array of height X width X color (H X W X C)
print(arr_image.ndim, arr_image.shape, arr_image) # prints something like: 3 (240, 160, 3) [ [ [20 47 90] .... [45 67 102] ] ]. So this picture has 240 pixels in height, 160 pixels in width and 3 values for color corresponding to R,G,B


img = Image.fromarray(arr_image) #from above array, form the picture again
img.show() #display orig image
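
The array <-> image round trip, and the difference between PIL's (width, height) and NumPy's (height, width, channels) ordering, can be checked with a tiny in-memory example. The 4x6 array below is made up for illustration; nothing is read from or written to disk.

```python
import numpy as np
from PIL import Image

# Build a small test array: height=4, width=6, 3 color channels.
arr = np.zeros((4, 6, 3), dtype=np.uint8)
arr[:, :3] = [255, 128, 0]          # left half orange
arr[:, 3:] = [0, 0, 255]            # right half blue

img = Image.fromarray(arr)
print(img.size)                     # PIL reports (width, height)
print(img.mode)                     # a 3-channel uint8 array becomes an RGB image

back = np.array(img)
print(back.shape)                   # NumPy reports (height, width, channels)
print(np.array_equal(arr, back))    # the in-memory round trip keeps every pixel

# getpixel takes (x, y), i.e. (column, row), the reverse of NumPy indexing.
print(img.getpixel((5, 0)), back[0, 5])
```

Keep in mind that saving to a .jpg file in between is lossy, so an array read back from a JPEG will generally not match the original pixel for pixel.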

3. misc functions: plenty of other operations on images are available. Look on the tutorialspoint website linked above.