Filesystems, Partitions and devices:

Any storage medium, whether it's a USB flash drive, a hard disk or a CD-ROM, stores bits as 1 or 0 at every location of the device. We need a way to identify which groups of bits together form a picture, a song, a text file, etc. This is where the file system (FS) comes into the picture. It establishes a particular way of storing data so that it can be retrieved later on.

Very good intro here: https://www.tldp.org/LDP/sag/html/disk-usage.html

HARD DISK:

Each device is represented by a separate device file. Older hard disk interfaces were either IDE or SCSI, both of which date from the 1980s. IDE disks used the parallel ATA or PATA i/f (parallel AT attachment). These have largely been replaced by SATA (or serial ATA) i/f since its introduction in the early 2000s, due to lower pin count, reduced complexity and higher speeds. SATA rev 1 had max transfer speed of 1.5Gbits/sec, while rev 2 had 3.0Gbits/sec. Rev 3 has transfer speed of 6.0Gbits/sec. The SATA i/f consists of 7 data pins and 15 power pins in 2 sets of pins. Out of the 7 data pins, 3 are gnd pins, and the remaining 4 are differential tx/rx pins (A+/A- for transmit, B+/B- for receive). Differential signaling reduces noise at high speed transmissions. The 15 power pins carry 3.3V/5.0V/12.0V power supply pins along with ground pins. Smaller SATA devices combine data pins and power pins in the same set, with only 5V supply lines provided to reduce pin count.

Now we also have SSDs (solid state drives). Many SSDs use the same SATA i/f, while newer ones use NVMe over PCIe, which gives much faster speeds.

Disk partition:

Each HD can be divided into several partitions, which allows different partitions of the same disk to behave as separate hard disks. For all practical purposes, partitions can be thought of as separate hard disks. So, we'll talk at partition level.

First sector of each HD (NOT partition) contains the partitioning info. This 1st sector is called master boot sector.

Very good intro here: https://www.minitool.com/lib/boot-sector.html

some more info here: https://www.bydavy.com/2012/01/lets-decrypt-a-master-boot-record/

Boot sector (Master Boot Sector, or more commonly Master Boot Record (MBR)) usually refers to the first sector of the hard disk. It's used for loading the operating system and transferring processor control to it. After power up, the boot process starts from ROM on the motherboard, and then control is transferred to the hard disk. This boot sector is the very first location read from the hard disk.

MBR contains a small program that reads the partition table, checks which partition is active (that is, marked bootable), and reads the first sector of that partition, the partition's boot sector (the MBR is also a boot sector, but it has a special status and therefore a special name). This boot sector contains another small program that reads the first part of the operating system stored on that partition (assuming it is bootable), and then starts it. Control is transferred to OS at this point.

MBR is all contained in the 1st sector of the hard disk (cyl 0, head 0, sector 1). Sectors are traditionally 512 bytes (newer disks may use 4096 byte physical sectors).

The original partitioning scheme for PC hard disks allowed only four partitions. This quickly turned out to be too little in real life. To overcome this, one primary partition was allowed to be subdivided further into more partitions, which were called logical partitions. Such a primary partition was called an extended partition. The first sector of each primary partition, as well as that of the disk, is called a boot sector, and contains a boot pgm.

Each partition and extended partition has its own device file. The naming convention for these files is that a partition's number is appended after the name of the whole disk, with the convention that 1-4 are primary partitions (regardless of how many primary partitions there are) and numbers 5 and greater are logical partitions (regardless of within which primary partition they reside). The fdisk output below counts sizes in blocks rather than sectors, where each block size=1024 bytes=2 sectors.

NOTE: exact size conversions are 1024 bytes = 1KB, 1024KB=1MB, 1024MB=1GB. However, cmds such as fdisk and parted report sizes using factors of 1000 instead of 1024, while others (such as lsblk) use 1024, so sizes may differ b/w diff cmds. With factors of 1000, 1GB is treated as 1000MB, 1MB as 1000KB, and 1KB as 1000 bytes, even though 1 block = 2 sectors = 1024 bytes. Each factor of 1000 vs 1024 gives a difference of about 2.4%, which adds up to about 7.4% at GB level.
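The two conventions can be compared with a small Python sketch. The function names are made up for illustration, and the byte count is the disk size that fdisk reports further down.

```python
def to_decimal_gb(num_bytes):
    """Size in GB using factors of 1000 (the way fdisk and parted report it)."""
    return num_bytes / 1000**3


def to_binary_gb(num_bytes):
    """Size in GB using factors of 1024 (the way lsblk reports it)."""
    return num_bytes / 1024**3


disk_bytes = 128035676160  # disk size in bytes, taken from the fdisk output

print(f"factors of 1000: {to_decimal_gb(disk_bytes):.2f} GB")
print(f"factors of 1024: {to_binary_gb(disk_bytes):.2f} GB")
print(f"difference at GB level: {(1024**3 / 1000**3 - 1) * 100:.1f}%")
```

So the same disk can show up as roughly 128GB in one cmd and roughly 119GB in another, without anything being wrong.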

Naming convention:

In Windows, each partition is assigned a "drive letter" such as C:\ (called the C drive), etc. In Linux, everything is treated as a file. So, all devices and partitions are essentially files. Disk devices are given file names in the form /dev/xxyN, where xx=hd (for IDE disks) or sd (for SCSI disks), y=a,b,c etc indicating 1st disk, 2nd disk, etc, while N=partition number on that disk. So, /dev/sda refers to the 1st disk, while /dev/sda2 refers to the 2nd partition on the 1st disk. SATA disks also show up as "sd" instead of "hd", because the Linux kernel handles them through its SCSI subsystem.
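As a rough illustration of the /dev/xxyN convention, here is a Python sketch that splits such a name into its parts. The function name and the returned keys are hypothetical, and it only handles the classic hd/sd names described above (other schemes, such as /dev/mmcblk0p1, are out of scope). The primary vs logical split assumes an MBR partitioned disk.

```python
import re

# Only the classic /dev/hdXN and /dev/sdXN names; anything else returns None.
DEVICE_RE = re.compile(r"^/dev/(hd|sd)([a-z])([1-9]\d*)?$")


def describe_device(path):
    """Split a device file name into prefix, disk index and partition number."""
    m = DEVICE_RE.match(path)
    if m is None:
        return None
    prefix, letter, number = m.groups()
    info = {
        "prefix": prefix,                           # "hd" or "sd"
        "disk_index": ord(letter) - ord("a") + 1,   # a => 1st disk, b => 2nd disk
        "partition": None,
        "kind": "whole disk",
    }
    if number is not None:
        n = int(number)
        info["partition"] = n
        # MBR convention: 1-4 are primary (or extended), 5 and up are logical
        info["kind"] = "primary/extended" if n <= 4 else "logical"
    return info


for name in ("/dev/sda", "/dev/sda2", "/dev/sdb6"):
    print(name, describe_device(name))
```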

fdisk: cmd "fdisk" provides all info about devices. It's used for disk partition mgmt/info too.

prompt> sudo fdisk -l

Disk /dev/sda: 128.0 GB, 128035676160 bytes, 250069680 sectors => Here hard disk is 128GB in size. It's a dual boot laptop, which has Windows and Linux installed. Each sector is 512 bytes, so 250069680 sectors * 512 = 128035676160 bytes. If we use exact factors of 1024, then 128035676160 bytes = 128035676160/1024 KB = 125034840KB = 125034840/1024 MB = 122104.3MB = 122104.3/1024 GB = 119.24GB. However, hard disk is reported as 128GB, which uses factors of 1000 instead of 1024, so 128035676160 bytes = 128.03 GB.
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x948dbc73

   Device Boot      Start         End                    Blocks        Id  System
/dev/sda1   *        2048     1026047                512000     7   HPFS/NTFS/exFAT => primary partition is from sda1 to sda4. sda1 size=512K blocks*1024/1000KB=524.3MB (assuming factors of 1000 instead of 1024). This is Windows system boot partition. It has boot, EFI dir in it.
/dev/sda2         1026048   166098943      82536448     7   HPFS/NTFS/exFAT => size=82536K blocks* 1024/1000=84516MB=84.5GB. This is Windows system partition, where all user pgm, etc are kept. This is the partition we are in when we log into our windows m/c.
/dev/sda3       248018944   250066943      1024000   27  Hidden NTFS WinRE => sda3 size=1024K blocks*1024/1000=1048.6MB=1.05GB. This is Windows Recovery partition.
/dev/sda4       166098944   248018943    40960000    5   Extended => 4th primary partition is an extended partition and contains 2 logical partitions, sda5 and sda6. size=40960K blocks *1024/1000 = 41.94GB. This extended partition is the one where we have Linux installed. It contains Linux OS (OS+user area) and memory swap space
/dev/sda5       166100992   168198143      1048576   83  Linux => This is first logical partition of extended partition. size=1048K blocks*1024/1000= 1073MB=1.07GB
/dev/sda6       168200192   248018943    39909376   8e  Linux LVM => 2nd logical partition. size=39909K blocks*1024/1000 = 40867MB=40.87GB. This contains centos-root and centos-swap spaces as indicated below. Their total blocks=35807232+4096000=39903232 blocks, pretty close to the 39909376 blocks indicated above

Partition table entries are not in disk order => NOTE: in disk order, sda1, sda2, sda4 and sda3 occupy contiguous sectors w/o any missing sectors in b/w them.

Disk /dev/mapper/centos-root: 36.7 GB, 36666605568 bytes, 71614464 sectors => blocks=35807232. so, size=36.7GB (due to rounding of 1000)
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/mapper/centos-swap: 4194 MB, 4194304000 bytes, 8192000 sectors => blocks=4096000. so, size=4194MB (using factors of 1000). Note: I chose 4GB swap space during installation of CentOS. The resulting size is 4096000KB = 4000*1024*1024 bytes = 4194304000 bytes, so the installer apparently treated 4GB as 4000MB (with 1MB=1024KB and 1KB=1024 bytes), which is different from how fdisk reports sizes.
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
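
The partition sizes, and the claim that the primary partitions follow each other without gaps, can be cross-checked from the Start/End columns. Below is a small Python sketch of that check. The sector numbers are copied from the fdisk table above, and the helper names are made up for illustration.

```python
SECTOR_BYTES = 512  # from "Units = sectors of 1 * 512 = 512 bytes"

# (name, start sector, end sector) copied from the fdisk table
partitions = [
    ("sda1", 2048, 1026047),
    ("sda2", 1026048, 166098943),
    ("sda3", 248018944, 250066943),
    ("sda4", 166098944, 248018943),
]


def size_bytes(start, end):
    """Partition size in bytes; both start and end sectors are inclusive."""
    return (end - start + 1) * SECTOR_BYTES


def find_gaps(parts):
    """Return (previous, next, missing sectors) for every gap, in disk order."""
    ordered = sorted(parts, key=lambda p: p[1])
    gaps = []
    for (prev_name, _, prev_end), (next_name, next_start, _) in zip(ordered, ordered[1:]):
        if next_start != prev_end + 1:
            gaps.append((prev_name, next_name, next_start - prev_end - 1))
    return gaps


for name, start, end in partitions:
    print(name, size_bytes(start, end) / 1000**2, "MB")
print("gaps:", find_gaps(partitions))
```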

parted: parted is another cmd for mgmt/info of disk partitions. It is most commonly used for partitioning of hard disks.

prompt> sudo parted -l => this also uses rounding of 1000, same as that of fdisk
Model: ATA SAMSUNG MZ7TD128 (scsi)
Disk /dev/sda: 128GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:

Number  Start   End     Size    Type      File system  Flags
 1      1049kB  525MB   524MB   primary   ntfs         boot
 2      525MB   85.0GB  84.5GB  primary   ntfs
 4      85.0GB  127GB   41.9GB  extended
 5      85.0GB  86.1GB  1074MB  logical   xfs                     => This shows logical partition 5 has 1 GB size, same as what fdisk showed.
 6      86.1GB  127GB   40.9GB  logical                lvm        =>  Here, logical partition 6 shows 40.9GB size, same as what fdisk showed
 3      127GB   128GB   1049MB  primary   ntfs         diag


Model: Linux device-mapper (linear) (dm)
Disk /dev/mapper/centos-swap: 4194MB
Sector size (logical/physical): 512B/512B
Partition Table: loop
Disk Flags:

Number  Start  End     Size    File system     Flags
 1      0.00B  4194MB  4194MB  linux-swap(v1)


Model: Linux device-mapper (linear) (dm)
Disk /dev/mapper/centos-root: 36.7GB
Sector size (logical/physical): 512B/512B
Partition Table: loop
Disk Flags:

Number  Start  End     Size    File system  Flags
 1      0.00B  36.7GB  36.7GB  xfs

------

 

dd: To copy contents of any device, we can use "dd" cmd. dd copies raw bytes from specified location, and has nothing to do with FS.

ex: sudo dd if=/home/file1.img of=/dev/sdb status=progress oflag=sync bs=4M => this copies file1.img (the i/p file, "if") to the o/p file ("of") /dev/sdb. Here the target is the device name and not a partition name, as we want to write the image to the whole device, which will create multiple partitions as needed. If we write the image to a particular partition, then booting from this new device may not work, as booting may require particular info in the first sectors of the device. Option status=progress shows progress (otherwise nothing is shown, so the cmd will seem stuck, as it takes a couple of minutes for large files to be copied). Block size (bs) to transfer is set to 4M (default is 512 bytes, which will take a long time to transfer a multi GB file). sudo is used since this cmd usually requires root privilege. Double check the "of" device name, since dd overwrites it without asking.

To view contents of MBR, type below 2 cmds in a terminal:

prompt> sudo dd if=/dev/sda of=mbr.bin bs=512 count=1 => dd cmd copies contents of any device byte by byte. if=input file, of=output file, bs=size in bytes to rd/wrt at a time, count=num of input blocks to copy. So, this cmd copies 512 bytes from the 1st sector of /dev/sda device (usually hard disk) to file mbr.bin. sudo is used since reading a raw disk device usually requires root privilege.

prompt> od -xa mbr.bin => od dumps contents of a file in octal or other formats. -x implies hex format (2 bytes at a time), -a implies print the named char for each byte. On my hard disk, it lists below content (each line has 16 bytes):

000000    63eb    d090    00bc    8e7c    8ec0    bed8    7c00    00bf => The first 446 bytes (0x0000 to 0x01BD) are taken up by the master boot program of the hard disk. od -x prints 2 bytes as one 16 bit little-endian word, so within each pair the order is reversed, i.e 63eb has 1st byte as eb, and 2nd byte as 63, d090 has 1st byte as 90, and 2nd byte as d0, and so on ... This area holds the bootstrap loader code, which here is GRUB (see the "GRUB" text further down in the dump).
         k   c dle   P   < nul   |  so   @  so   X   > nul   |   ? nul
000010    b906    0200    f3fc    50a4    1c68    cb06    b9fb    0004
       ack   9 nul stx   |   s   $   P   h  fs ack   K   {   9 eot nul
000020    bebd    8007    007e    7c00    0f0b    0e85    8301    10c5
         =   > bel nul   ~ nul nul   |  vt  si enq  so soh etx   E dle
000030    f1e2    18cd    5688    5500    46c6    0511    46c6    0010
         b   q   M can  bs   V nul   U   F   F dc1 enq   F   F dle nul
000040    41b4    aabb    cd55    5d13    0f72    fb81    aa55    0975
         4   A   ;   *   U   M dc3   ]   r  si soh   {   U   *   u  ht
000050    c1f7    0001    0374    46fe    6610    8000    0001    0000
         w   A soh nul   t etx   ~   F dle   f nul nul soh nul nul nul
000060    0000    0000    faff    9090    c2f6    7480    f605    70c2
       nul nul nul nul del   z dle dle   v   B nul   t enq   v   B   p
000070    0274    80b2    79ea    007c    3100    8ec0    8ed8    bcd0
         t stx   2 nul   j   y   | nul nul   1   @  so   X  so   P   <
000080    2000    a0fb    7c64    ff3c    0274    c288    be52    7c05
       nul  sp   {  sp   d   |   < del   t stx  bs   B   R   > enq   |
000090    41b4    aabb    cd55    5a13    7252    813d    55fb    75aa
         4   A   ;   *   U   M dc3   Z   R   r   = soh   {   U   *   u
0000a0    8337    01e1    3274    c031    4489    4004    4488    89ff
         7 etx   a soh   t   2   1   @  ht   D eot   @  bs   D del  ht
0000b0    0244    04c7    0010    8b66    5c1e    667c    5c89    6608
         D stx   G eot dle nul   f  vt  rs   \   |   f  ht   \  bs   f
0000c0    1e8b    7c60    8966    0c5c    44c7    0006    b470    cd42
        vt  rs   `   |   f  ht   \  ff   G   D ack nul   p   4   B   M
0000d0    7213    bb05    7000    76eb    08b4    13cd    0d73    845a
       dc3   r enq   ; nul   p   k   v   4  bs   M dc3   s  cr   Z eot
0000e0    0fd2    de83    be00    7d85    82e9    6600    b60f    88c6
         R  si etx   ^ nul   > enq   }   i stx nul   f  si   6   F  bs
0000f0    ff64    6640    4489    0f04    d1b6    e2c1    8802    88e8
         d del   @   f  ht   D eot  si   6   Q   A   b stx  bs   h  bs
000100    40f4    4489    0f08    c2b6    e8c0    6602    0489    a166
         t   @  ht   D  bs  si   6   B   @   h stx   f  ht eot   f   !

000110    7c60    0966    75c0    664e    5ca1    667c    d231    f766
         `   |   f  ht   @   u   N   f   !   \   |   f   1   R   f   w
000120    8834    31d1    66d2    74f7    3b04    0844    377d    c1fe
         4  bs   Q   1   R   f   w   t eot   ;   D  bs   }   7   ~   A
000130    c588    c030    e8c1    0802    88c1    5ad0    c688    00bb
        bs   E   0   @   A   h stx  bs   A  bs   P   Z  bs   F   ; nul
000140    8e70    31c3    b8db    0201    13cd    1e72    c38c    1e60
         p  so   C   1   [   8 soh stx   M dc3   r  rs  ff   C   `  rs
000150    00b9    8e01    31db    bff6    8000    c68e    f3fc    1fa5
         9 nul soh  so   [   1   v   ? nul nul  so   F   |   s   %  us
000160    ff61    5a26    be7c    7d80    03eb    8fbe    e87d    0034
         a del   &   Z   |   > nul   }   k etx   >  si   }   h   4 nul
000170    94be    e87d    002e    18cd    feeb    5247    4255    0020
         > dc4   }   h   . nul   M can   k   ~   G   R   U   B  sp nul                      => NOTE: we see text "GRUB" here meaning it has GRUB BootLoader
000180    6547    6d6f    4800    7261    2064    6944    6b73    5200
         G   e   o   m nul   H   a   r   d  sp   D   i   s   k nul   R                          => NOTE: we see text "geom Hard Disk Read Error". This is for printing Error by the bootloader.
000190    6165    0064    4520    7272    726f    0a0d    bb00    0001
         e   a   d nul  sp   E   r   r   o   r  cr  nl nul   ; soh nul
0001a0    0eb4    10cd    3cac    7500    c3f4    0000    0000    0000
         4  so   M dle   ,   < nul   u   t   C nul nul nul nul nul nul
0001b0    0000    0000    0000    0000    bc73    948d    0000   => The 446 byte area ends here at 0x01bd. Its last 6 bytes (0x01b8 to 0x01bd) are the 4 byte Disk Id/signature (optional) followed by 2 null bytes "0000" (order is incorrectly mentioned in the link above). Here disk signature is "948d bc73" from MSB to LSB, which is same as what was reported by fdisk cmd above = 0x948dbc73

0001be  2080 => These 16 bytes are partition 1. 1st byte is status. It's 80 for bootable partition and 00 for non bootable one. Here, this partition is bootable as status=80, other 3 are non bootable. It's bootable as this is the boot/EFI partition from windows.
0001c0    0021    dd07    3f1e   0800    0000    a000    000f   => 5th byte is partition type/Id, which indicates the type of partition and hence the FS. It is for use by the OS. It's not formally standardized, and Linux mostly ignores it. Here it's 0x07, which is Microsoft/IBM FS (NTFS, exFAT, etc). The last 8 bytes are the start sector (0x00000800=2048) and the number of sectors (0x000fa000=1024000), so size=524MB, FS=NTFS.

0001ce    dd00 => These 16 bytes are partition 2.
0001d0    3f1f    fe07    ffff    a800    000f    d000    09d6  => Partition type = 0x07. FS=NTFS

0001de    fe00 => These 16 bytes are partition 3.
0001e0    ffff    fe27    ffff    7800    0ec8    4000    001f   => Partition type = 0x27, which is Windows Recovery partition. size=1GB, FS=NTFS

0001ee    fe00 => These 16 bytes are partition 4.
0001f0    ffff    fe05    ffff    7800    09e6    0000    04e2  => Partition type = 0x05, which is extended partition type. This has 2 logical partitions, as explained in fdisk cmd above

0001fe   aa55 => these last 2 bytes of 512 bytes "55AA" marks end of MBR. It is used as a signature for MBR. 0x55 is 511th byte(addr 0x01fe), while 0xAA is 512th byte (addr 0x01ff)
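
The decoding done by hand above can be sketched in Python. The hex words are copied from the dump (offsets 0x1be to 0x1ff), the helper names are made up for illustration, and only the fields discussed in these notes are decoded (status, type, start sector, number of sectors). The CHS bytes are skipped.

```python
import struct


def od_words_to_bytes(words):
    """Turn od -x style 16 bit words (e.g. "2080") back into on-disk byte order."""
    return b"".join(struct.pack("<H", int(w, 16)) for w in words.split())


def parse_entry(entry):
    """Decode one 16 byte MBR partition entry."""
    status = entry[0]   # 0x80 = bootable, 0x00 = not bootable
    ptype = entry[4]    # partition type/Id
    start_lba, num_sectors = struct.unpack_from("<II", entry, 8)
    return {
        "bootable": status == 0x80,
        "type": ptype,
        "start": start_lba,
        "sectors": num_sectors,
    }


# The four partition entries, copied from the dump
table = od_words_to_bytes(
    "2080 0021 dd07 3f1e 0800 0000 a000 000f "
    "dd00 3f1f fe07 ffff a800 000f d000 09d6 "
    "fe00 ffff fe27 ffff 7800 0ec8 4000 001f "
    "fe00 ffff fe05 ffff 7800 09e6 0000 04e2"
)
signature = od_words_to_bytes("aa55")  # last 2 bytes of the sector

entries = [parse_entry(table[16 * i:16 * i + 16]) for i in range(4)]
for number, entry in enumerate(entries, start=1):
    print(number, hex(entry["type"]), entry)
print("signature ok:", signature == b"\x55\xaa")
```

The start sector and sector count values that come out of this match the "Start" column of fdisk and the sector counts reported by the file cmd.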

After the master boot sector, the next sectors contain the actual partitions. Partition Type/Id is specified for each partition in the MBR. Its value indicates the FS, and originated from IBM/Microsoft in the early days of PCs. However, it's mostly ignored by Linux.

file: file cmd figures out the type of a file, since it's hard for a user in linux to know file type (since extensions have no meaning in linux)

file mbr.bin => displays the type of the file that was dumped from MBR above

mbr.bin: x86 boot sector; => This says that this file is a boot sector file. It then knows how to read partition info from remaining bytes

partition 1: ID=0x7, active, starthead 32, startsector 2048, 1024000 sectors; => 1000K*0.5KB/sector=500MB

partition 2: ID=0x7, starthead 221, startsector 1026048, 165072896 sectors; => 78.7GB

partition 3: ID=0x27, starthead 254, startsector 248018944, 2048000 sectors; => 1GB. This is windows recovery partition.

partition 4: ID=0x5, starthead 254, startsector 166098944, 81920000 sectors, => 80000K*0.5KB=40000MB=39.06GB (using factors of 1024)

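code offset 0x63, OEM-ID "      м", Bytes/sector 190, sectors/cluster 124, reserved sectors 191, FATs 6, root entries 185, sectors 64514 (volumes <=32 MB) , Media descriptor 0xf3, sectors/FAT 20644, heads 6, hidden sectors 309755, sectors 2147991229 (volumes > 32 MB) , physical drive 0x7e, dos < 4.0 BootSector (0x0) => these FAT style fields look meaningless here (e.g. Bytes/sector 190), most likely because file is reading boot code bytes as if they were FAT parameters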

Now, we can repeat same exercise for external ssd hard disk connected via usb.

prompt> sudo dd if=/dev/sdb of=ssd.bin bs=512 count=1 => assuming external hard disk is on /dev/sdb

prompt> hexdump ssd.bin => This is hexdump of 1st sector of this external ssd hard disk connected via usb. It shows 1st sector as boot sector, and shows the only partition it has (partition 1)

0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
00001b0 0000 0000 0000 0000 9c01 078d 0000 => since there is no bootstrap code here, most entries are 0. Here disk signature is "078d 9c01" from MSB to LSB

0001be 2000 => these 16 bytes are partition 1. 1st byte=00 => it's non-bootable partition
00001c0 0021 fe07 ffff 0800 0000 9da7 773b

0001ce 0000 => partition 2 to partition 4 are all 0 => no more partitions
00001d0 0000 0000 0000 0000 0000 0000 0000 0000
*
00001f0 0000 0000 0000 0000 0000 0000 0000 aa55 => aa55 indicates end of MBR

file ssd.bin => shows file type of MBR file from external hard disk

ssd.bin: x86 boot sector; partition 1: ID=0x7, starthead 32, startsector 2048, 2000395687 sectors, extended partition table (last)\011, code offset 0x0 => size=2000395687*0.5KB/sector=1,000,197,843.5KB = 953.9GB using factors of 1024 (about 1.02TB using factors of 1000), i.e. a 1TB disk

Filesystems (FS): It's a very important concept, especially for linux, as everything in Linux is treated as a file. A FS is the set of methods and data structures that an OS uses to keep track of files on a disk or partition; that is, the way the files are organized on the disk. FS are stored on a partition of a disk, and only 1 kind of FS is allowed on a particular partition. Thus, for ex, ext3 FS might be stored on partition 1 of HDD.

Unix filesystems are arranged as a big tree, with the tree hier rooted at "/" or root. These files can be spread out over several devices, and they can be on remote or local file systems. Linux supports numerous file system types.


These are the most common disk file systems (file systems suited for use on disk storage media).
1. Ext (extended FS): Linux file systems started with the Minix file system, which was then enhanced to Ext in 1992 (solved the max partition size problem). Newer versions came later as Ext2, Ext3, Ext4.
2. Ext2: enhanced version of Ext which supported inode modification, timestamps, etc.
3. Ext3: popular enhanced version of ext2 supporting journaling. Was the std linux FS on most distros for many years. Can support 2TB max file size, and 16TB max partition size.
4. Ext4: Can support 16TB max file size, and 1EB max partition size. Default on many current distros. Not natively supported on Windows.
5. FAT32 (FAT=File Allocation Table, proprietary Windows FS): FAT was originally designed in the late 1970s for floppy drives. FAT32 limits file size to 4GB (2^32 bytes, about 4.29GB using factors of 1000), and partition size to 2TB. Windows format tools limit partition size to 32GB. FAT32 has been supported on Windows systems starting from Windows 95 (OSR2 release). FAT32 is very popular on USB flash drives, which often come formatted with it, since it is supported across nearly all OS and devices. However, due to the file size limitation, larger devices commonly come formatted with NTFS or exFAT.
6. NTFS (Windows New Tech (NT) FS, proprietary Windows FS that replaced FAT): File and partition size limits are far higher than those of FAT32 (partition size limit is 256TB with the largest std cluster size). Widely used as an alternative to FAT32, since the NTFS-3G driver is provided with most linux distros for rd/wrt. Default FS for Windows NT (1993), Windows 2000 and XP onwards. Partition tables on MBR (Master Boot Record) disks only support a max partition size of 2TB; GPT is used to get to larger partition sizes.

7. exFAT: known as extended FAT, it was introduced in 2006 for USB flash drives and SD memory cards. It's used in places where NTFS is not feasible due to data structure overhead, but file sizes > 4GB are needed, so FAT32 is not possible. exFAT is now the standard FS for SDXC cards > 32GB size. A driver for it is included in the Linux kernel from 5.4 onwards, so older linux distros may not support exFAT out of the box.
Misc:
8. ISO9660, Universal Disk Format (UDF) => optical disc file systems for use in CD/DVD
9. IBM DB2: database based file system. Instead of hier structured mgmt, files here are identified by rich metadata such as type, topic, etc.
10. NFS: network file system, in which one machine (client) requires access to data stored on another machine (NFS server). It's an open std, and was developed by SUN in 1984. The server runs the nfsd daemon on some port; the client connects to that port and requests data, which is passed to the client by nfsd.
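
Two of the limits mentioned in the list above (the FAT32 file size limit and the 2TB MBR partition limit) follow from 32 bit fields. The Python sketch below just redoes that arithmetic. It assumes the traditional 512 byte sector, and the variable names are made up for illustration.

```python
SECTOR_BYTES = 512   # assumption: traditional 512 byte sectors
GB = 1024**3         # using exact factors of 1024 here
TB = 1024**4

# FAT32 stores the file size in a 32 bit field
fat32_max_file_bytes = 2**32 - 1

# An MBR partition entry stores start sector and sector count as 32 bit values
mbr_max_partition_bytes = 2**32 * SECTOR_BYTES

print(f"FAT32 max file size: {fat32_max_file_bytes / GB:.2f} GB")
print(f"MBR max partition size: {mbr_max_partition_bytes / TB:.0f} TB")
```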

Making FS: Before a partition or disk can be used as a filesystem, it needs to be initialized, and the bookkeeping data structures need to be written to the disk. This process is called making a filesystem. Various data structures such as superblock, inode , data block, directory block , and indirection block are kept in various sectors of partition to allow Linux OS to retrieve files.

mkfs: This cmd is used as a front end i/f for making various FS. mkfs is a wrapper which calls the FS specific cmd mkfs.<type> (for ext2/ext3/ext4 this ends up in mke2fs), which does the real FS creation.

If we type mkfs and hit tab, we see all supported FS. i.e mkfs.ext2 creates ext2 FS, similarly for others. Their option differ slightly, so read doc for each of them when using.

ex: mkfs.ext2 -c /dev/fd0H1440 => -c searches for bad blocks, and initializes the bad block list.

ext2 FS has following structure: https://www.nongnu.org/ext2-doc/ext2.html

The best way to learn making a FS, and see what contents get created in it in various sectors, is to follow this link: https://www.howtogeek.com/443342/how-to-use-the-mkfs-command-on-linux/

ex: First create an image file with all 0 in it. Then create a FS in it by using mkfs.ext2 cmd. Then we can mount this image file just like any other FS.

> dd if=/dev/zero of=~/fs.img bs=1M count=50 => This creates a 50MB image file (with 1MB block size) called fs.img with all 0s in it.

> ls -al ~/fs.img => To show size of file
-rw-rw-r--. 1 kailash kailash 52428800 Dec 10 03:56 /home/kailash/fs.img => 50MB (=50*1024*1024Bytes=52428800 bytes)

> mkfs.ext2 ~/fs.img => asks for confirmation, and then creates ext2 FS

> file ~/fs.img => run after mkfs, so the image is now recognized as ext2 FS
/home/rakesh/fs.img: Linux rev 1.0 ext2 filesystem data (mounted or unclean), UUID=1ca78acd-c8c4-4e02-b4db-10d76395f252 => mkfs assigns a random 16 byte UUID

> sudo mkdir /mnt/tmp_dev
> sudo mount ~/fs.img /mnt/tmp_dev => mounts the FS at /mnt/tmp_dev

> sudo cp ~/tmp.txt /mnt/tmp_dev/. => this copies tmp.txt file to newly mounted FS. This FS now behaves like any other dir on our hard disk

> hexdump -n2048 ~/fs.img => this dumps first 2048 bytes of FS. As can be seen, first 1024 bytes (block 0) are all 0, implying no boot record is present. then next 1024 bytes (block 1) has superblock, and then other data in rest of the blocks. The superblock is always located at byte offset 1024 from the beginning of the file, block device or partition formatted with Ext2 and later variants (Ext3, Ext4).  
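
To make the superblock location concrete, here is a Python sketch that picks a few fields out of the first 2048 bytes of an image. The field offsets used (inode count at 0, block count at 4, log block size at 24, magic 0xEF53 at 56) are from the ext2 layout doc linked above. The function name is made up, and the byte buffer is a hand built stand-in with hypothetical values, so the sketch runs without a real fs.img.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # superblock starts at byte 1024 of the FS
EXT_MAGIC = 0xEF53         # magic number shared by ext2/ext3/ext4


def read_superblock(image):
    """image: bytes holding at least the first 2048 bytes of the FS."""
    sb = image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + 1024]
    inodes, blocks = struct.unpack_from("<II", sb, 0)
    (log_block_size,) = struct.unpack_from("<I", sb, 24)
    (magic,) = struct.unpack_from("<H", sb, 56)
    if magic != EXT_MAGIC:
        raise ValueError("no ext2/ext3/ext4 superblock found")
    return {
        "inodes": inodes,
        "blocks": blocks,
        "block_size": 1024 << log_block_size,
    }


# Stand-in for the first 2048 bytes of an image (hypothetical values)
fake = bytearray(2048)
struct.pack_into("<II", fake, SUPERBLOCK_OFFSET, 12824, 51200)
struct.pack_into("<I", fake, SUPERBLOCK_OFFSET + 24, 0)
struct.pack_into("<H", fake, SUPERBLOCK_OFFSET + 56, EXT_MAGIC)

print(read_superblock(bytes(fake)))
```

For a real image, the same function could be fed with the first 2048 bytes of the file, e.g. the result of open(path, "rb").read(2048).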

NOTE: Not all disks or partitions are used as filesystems. A swap partition, for example, will not have a filesystem on it. Linux boot floppies don't contain a filesystem, only the raw kernel. Not having a FS saves space. Raw disks are used to make image copies of them (using dd cmd).

Linux cmds:

1. lsblk = list block: lists information about all available block devices, however, it does not list information about RAM disks. Examples of block devices are a hard disk, flash drives, CD-ROM etc. So, this is a very useful cmd to see all available storage devices.

$ lsblk => Shows all devices with their major:minor number. RM=1 implies removable device, RO=1 implies readonly, TYPE shows whether it's a disk, a partition within a disk (part), rom, etc. We'll talk about mount point later. Option -p prints the full device path, as /dev/sda etc, else it only shows sda, sdb, etc.

prompt> lsblk ( on my dual boot linux/windows laptop)
NAME            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda               8:0    0 119.2G  0 disk => sda usually is the main hard disk from which the OS boots. At Linux installation time, we usually divide this hard drive into separate partitions (over here 6 partitions, some of which are Windows partitions; 4 are primary partitions, 2 are logical partitions). Size shows as 119.2G, since lsblk uses factors of 1024.
├─sda1            8:1    0   500M  0 part /run/media/rajesh/System => on Linux Mint, it's mounted on /media/rajesh/System. It's partition 1, the Windows bootable partition in MBR table.
├─sda2            8:2    0  78.7G  0 part /run/media/rajesh/Windows => It's partition 2, where Windows OS is installed
├─sda3            8:3    0  1000M  0 part => It's partition 3, Windows recovery partition
├─sda4            8:4    0     1K  0 part
├─sda5            8:5    0     1G  0 part /boot
└─sda6            8:6    0  38.1G  0 part
  ├─centos-root 253:0    0  34.2G  0 lvm  /
  └─centos-swap 253:1    0   3.9G  0 lvm  [SWAP]
sr0              11:0    1  1024M  0 rom  

 

NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0 931.5G  0 disk
├─sda1   8:1    0   260M  0 part /boot/efi => sda1, sda2 etc refer to partition within the disk sda
├─sda2   8:2    0    16M  0 part
├─sda3   8:3    0 246.7G  0 part /media/rajesh/Windows => This is the Windows partition, since we store Windows OS in a separate partition from Linux OS, although both reside on the same physical disk.
├─sda4   8:4    0   980M  0 part
├─sda5   8:5    0   7.5G  0 part [SWAP] => here 7.5G of sda was allocated for SWAP and separate partition called "sda5" was created for it.
└─sda6   8:6    0 676.1G  0 part / => Here almost all of disk, 676G, was allocated to root file system "/". This is where everything in linux is stored, starting from root dir.

sdb               8:16   0 953.9G  0 disk  => This is external ssd drive connected via usb
└─sdb1            8:17   0 953.9G  0 part /run/media/rajesh/samsung-ssd => It has 1 partition mounted on path shown

sr0     11:0    1  1024M  0 rom  => This is another device "rom" (usually cd drive on computer) named as sr0. On almost all computers, we'll see at least these 2 devices, sda and sr0.

lsblk -f => This lists the FS for each partition as well as the UUID of each FS. UUID is a 128 bit universally unique id, assigned when the FS is created.
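
If the lsblk tree needs to be processed by a script, newer versions of lsblk can print JSON (option -J or --json), with nested devices under a "children" key. The Python sketch below walks such a tree. The JSON string is a hand written, trimmed sample modelled on the laptop output above (an assumption, not captured output), and the function name is made up.

```python
import json

# Hand written sample in the shape that "lsblk -J" prints (trimmed, assumed)
sample = """
{"blockdevices": [
  {"name": "sda", "type": "disk", "size": "119.2G", "children": [
    {"name": "sda5", "type": "part", "size": "1G"},
    {"name": "sda6", "type": "part", "size": "38.1G", "children": [
      {"name": "centos-root", "type": "lvm", "size": "34.2G"}
    ]}
  ]}
]}
"""


def walk(devices, depth=0):
    """Yield (depth, name, type, size) for every device in the tree."""
    for dev in devices:
        yield depth, dev["name"], dev["type"], dev["size"]
        yield from walk(dev.get("children", []), depth + 1)


rows = list(walk(json.loads(sample)["blockdevices"]))
for depth, name, dev_type, size in rows:
    print("  " * depth + name, dev_type, size)
```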

Mounting under linux:

In linux, you see devices under /dev/* (i.e lsblk devices listed as sda etc are in /dev/sda ....). However, if we try to open /dev/sda directly like a normal file, we get a message "not a regular file". That's because all these entries are device files. For ex /dev/cdrom refers to the CD ROM device. This is not the contents of whatever disc you might wish to insert into your optical drive, but rather a reference to the bit of hardware (and probably software drivers) that you might call on to show that to you. We can use "dd" cmd to read the contents of the whole device as a big string of bytes (known as disk image), however it's not very meaningful info on its own. FS type tells us how to interpret that string of bytes. In linux, we use cmd "mount" to ask the OS to figure out the FS of this device and interpret contents based on that. mount attaches the FS found on some device to the big file tree, so that we can access it as if it's one tree at /somedir/. When you mount /dev/cdrom to some path in your tree, you attach its contents to your file system.

ex: mount -t iso9660 /dev/cdrom /media/cdrom => mount cmd tells the system: "take this very long string of bytes that you have in /dev/cdrom, interpret it as a directory tree in the iso9660 format (the first blocks of the device/partition have the necessary info to help interpret the FS on the device), and allow me to access it under the location /media/cdrom". We create this dir "/media/cdrom" ourselves. It can be any dir anywhere on the system. By convention, removable FS are usually mounted in /media dir. -t is optional, as mount can usually figure out the FS itself by reading the first few blocks of that device/partition.

mount cmd uses a lot of heuristics to figure out FS of device, which may not always work. That's why it's sometimes necessary to provide -t option to specify the FS explicitly.

ex: sudo umount /media/cdrom => This unmounts whatever was mounted at mount point "/media/cdrom", so that the FS is no longer accessible


ex: mount => shows all FS mounted on this m/c (in the tree starting at /)
/dev/hda1           on /       type ext3 (rw,acl,user_xattr) => shows that partition hda1, physically attached to m/c (hard drive 1), is mounted on /, i.e all of partition /dev/hda1 is a FS accessible at "/"
/dev/cciss/c0d0 on /tmp type ext3 (rw,acl,user_xattr) => /tmp is mounted from another drive c0d0 physically attached to m/c.
duke4.abc.com:/vol/fvol528/kagr on /home/rakesh type nfs (rw, ..) => /home/rakesh is mounted on diff m/c duke4.abc.com. This is nfs FS (as it's on diff m/c)

ex: mount => on my laptop, it shows a lot of FS mounted. The important ones are:

/dev/sda1 on /run/media/rakesh/System type fuseblk (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other,blksize=4096)
/dev/sda2 on /run/media/rakesh/Windows type fuseblk (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other,blksize=4096)

/dev/sda5 on /boot type xfs (rw,relatime,seclabel,attr2,inode64,noquota)

/dev/mapper/centos-root on / type xfs (rw,relatime,seclabel,attr2,inode64,noquota) => /centos-root is mounted on Root dir "/".  

/dev/mmcblk0p1 on /media/sd_card type fuseblk (rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other,blksize=4096) => This is sd card that I mounted by using "mount" cmd

proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) => shows proc is mounted on /proc (even though proc is a virtual FS provided by the kernel, not a FS stored on a device)
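
The lines printed by "mount" all follow the same pattern: source, "on", mount point, "type", FS type, then options in brackets. Below is a minimal Python sketch that splits one such line into its parts. The names MountEntry and parse_mount_line are made up for illustration, and the regex only assumes the line format seen in the examples above.

```python
import re
from typing import List, NamedTuple, Optional


class MountEntry(NamedTuple):
    source: str        # device file, or host:/path for nfs
    mount_point: str   # dir where the FS is attached
    fs_type: str       # ext3, xfs, nfs, fuseblk, proc, ...
    options: List[str]


# Assumed line format: <source> on <mount point> type <fstype> (<opt>,<opt>,...)
_MOUNT_RE = re.compile(
    r"^(?P<src>\S+)\s+on\s+(?P<mnt>.+?)\s+type\s+(?P<fs>\S+)\s+\((?P<opts>[^)]*)\)"
)


def parse_mount_line(line: str) -> Optional[MountEntry]:
    """Return the parts of one line of 'mount' output, or None if it doesn't match."""
    m = _MOUNT_RE.match(line.strip())
    if m is None:
        return None
    opts = [o.strip() for o in m.group("opts").split(",") if o.strip()]
    return MountEntry(m.group("src"), m.group("mnt"), m.group("fs"), opts)


entry = parse_mount_line(
    "/dev/sda5 on /boot type xfs (rw,relatime,seclabel,attr2,inode64,noquota)"
)
print(entry.source, entry.mount_point, entry.fs_type)  # /dev/sda5 /boot xfs
```

Any text after the closing bracket (such as the "=> ..." remarks in these notes) is ignored by the pattern.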

Partitioning HD:

Usually we want to partition our HD into several partitions to keep them safe from each other, i.e a crash in one partition won't damage the other partitions. General practice is to have separate partitions for /, /boot, and swap. You cannot create separate partitions for the following directories: /bin, /etc, /dev, /initrd, /lib, and /sbin. The contents of these directories are required at bootup and must always be part of the / partition.

It is also recommended that you create separate partitions for /var and /tmp. This is because both directories typically have data that is constantly changing. Not creating separate partitions for these filesystems puts you at risk of having log files fill up the / partition.

-------

disk usage cmds

1. df => shows free disk space (df=disk free)
df -kT =>shows amount of diskspace available on file system (for all of the filesystem). -T option shows file system type too. -k shows size in 1K blocks (default in RHEL)
Ex: /home/rajesh> df -T | grep kagrawal => df shows all filesystems, to see diskspace for kagrawal only, we do a grep
near13.srv.abc.com:/vol/fvol156/kagrawal => implies kagrawal has home dir on this server.
nfs 1024000 512416 511584 51% /home/kagrawal
#nfs shows that file system type is nfs => shown only with -T option
#the first column is space allocated to your unix home dir, the second column is space you have used
#the third column is space which is available to you and the percentage you have used
Ex: /home/rajesh> df . => shows usage for current dir. To see usage for some other dir, type: df dir1/dir2
Ex: /proj/dsp/Testbenches $ df .
Filesystem 1K-blocks Used Available Use% Mounted on
coupe.abc.com:/vol/fvol32/proj1 => shows the root dir for the current dir and where it's mounted
5242880 5242848 32 100% /proj/name2 => size is 5 GiB (5242880 blocks of 1K each), and it is 100% used

ex: /home/rajesh> df => on a laptop, this is what it looks like
Filesystem              1K-blocks     Used Available Use% Mounted on
/dev/mapper/centos-root  35789748 18501240  17288508  52% / => centOS-root FS is mounted on /

/dev/sda5                 1038336   174224    864112  17% /boot
tmpfs                      770336       68    770268   1% /run/user/1000
/dev/sda1                  511996    54848    457148  11% /run/media/kailash/System
/dev/sda2                82536444 44197628  38338816  54% /run/media/kailash/Windows
/dev/mmcblk0p1           62504960  1622400  60882560   3% /media/sd_card
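
The Use% column can be cross checked from the Used and Available columns. The sketch below assumes that df computes it as used/(used+available), rounded up to the next whole percent. This assumption reproduces every Use% value in the listings above. The function names are made up for illustration.

```python
def use_percent(used_kb: int, avail_kb: int) -> int:
    """Use% as used / (used + available), rounded up (assumed df behaviour)."""
    total = used_kb + avail_kb
    if total == 0:
        return 0
    # integer ceiling division, avoids floating point rounding
    return (used_kb * 100 + total - 1) // total


def kib_to_gib(blocks_1k: int) -> float:
    """Convert a count of 1K blocks to GiB."""
    return blocks_1k / (1024 * 1024)


print(use_percent(18501240, 17288508))  # 52, the / row of the laptop example
print(use_percent(512416, 511584))      # 51, the nfs home dir example
print(kib_to_gib(5242880))              # 5.0
```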

2. du => disk usage

du -skh dir1/* => Summarizes disk usage of each file or dir matching dir1/* (directories are totalled recursively).
-k says report in block_size=1K, while -h says to report it in human readable format (as 234M, etc). When both are given, the one given last takes effect (here -h).
-s summarizes directories (reports only 1 line with total size for each matching dir)
du -sk * .??* | sort -n => to sort all files/dir (including dot files) in your current dir
du -ch | grep total => lists 1 line with total size of dir
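
To see what "summarize recursively" means, here is a rough Python sketch that adds up file sizes below a dir, similar in spirit to "du -s". It is only an approximation: it sums apparent file sizes, while du reports the disk blocks actually allocated, so the numbers will usually differ a little. The names dir_size_bytes and human are made up for illustration.

```python
import os


def dir_size_bytes(path: str) -> int:
    """Sum the apparent size of every file below path (symlinks are not followed)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable, skip it
    return total


def human(nbytes: int) -> str:
    """Format a byte count with a 1024 based suffix, loosely like the -h option."""
    size = float(nbytes)
    for unit in ("B", "K", "M", "G", "T"):
        if size < 1024 or unit == "T":
            return f"{size:.1f}{unit}"
        size /= 1024
    return f"{size:.1f}T"  # not reached, kept for clarity


print(human(dir_size_bytes(".")))
```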

Installing File system drivers on Linux: Each of these devices or partitions within a device can have its own separate FS. However, the OS has to know how to read/write these FS, else it may not work on partitions with a different FS than what it knows. Usually Linux OS knows ext and FAT/NTFS, and works pretty well with them. Since most devices already come formatted with NTFS, Linux has no problems reading/writing to these. However, we may need to install drivers for certain FS if they are not installed by default.

installing NTFS drivers:

sudo yum install epel-release => ntfs-3g pkg comes from the EPEL software repository. If the EPEL repository is not enabled, enable it using this cmd

sudo yum install ntfs-3g -y => installs the NTFS driver (available in yum once EPEL is enabled)

Once installed, linux OS should automatically be able to mount the NTFS drive. If not, we can always manually mount by using mount cmd

installing exFAT drivers: 

Most linux distro don't provide support for exFAT FS. So, we can get an error "cannot mount exFAT" when we insert a device which has exFAT FS. To be able to mount exFAT filesystem on CentOS you'll need to install the free FUSE exFAT module and tools which provide a full-featured exFAT file system implementation for Unix-like systems. However, the yum repositories for CentOS don't have these exFAT packages available. They are available from Nux Dextop (3rd party) repository, and can be installed using yum with steps as below:

sudo yum install epel-release => Nux repository depends on the EPEL software repository. If the EPEL repository is not enabled, enable it using this cmd
sudo rpm -v --import http://li.nux.ro/download/nux/RPM-GPG-KEY-nux.ro  => import the repository GPG key
sudo rpm -Uvh http://li.nux.ro/download/nux/dextop/el7/x86_64/nux-dextop-release-0-5.el7.nux.noarch.rpm => enable the Nux repository by installing rpm pkg
sudo yum install exfat-utils fuse-exfat => now install the exfat-utils and fuse-exfat packages from nux repo

Once installed, linux OS should automatically be able to mount the exFAT drive. If not (i.e we don't see any response on inserting SD card), we can always manually mount by using mount cmd:

> lsblk => This shows that SD card is being recognized by system, but it's not able to read its contents (as we don't see it under devices in gui)

mmcblk0         179:0    0  59.6G  0 disk
└─mmcblk0p1     179:1    0  59.6G  0 part /media/sd_card

> sudo mount -t exfat /dev/mmcblk0p1 /media/sd_card => As soon as we run this cmd, we see SD card shows up under devices in gui (sometimes it's necessary to specify FS using -t, as OS may still have trouble identifying the FS of the device, even though the FS drivers might be installed, this is due to various heuristics being used to figure out the file system)
FUSE exfat 1.3.0

> sudo umount /media/sd_card => once done, we should unmount the sd card, before ejecting it out.

 

---------------------------

 

 

Chemistry:

Chemistry deals with all elements, molecules, compounds, their properties and their reactions with each other. Basically, chemistry is the only field of science which deals with things that we cannot see with our eyes

Few good links for chemistry:

This section will deal with many topics in the link above. We'll talk about atoms, molecules, compounds and then complex compounds.

 

 

X window system (X11 or X, but not X windows) is a windowing system for bitmap displays. X provides basic framework for a GUI env: drawing and moving windows on the display device and interacting with mouse and keyboard. It originated at MIT in 1984. XFree86 is the implementation of X that was used on almost all Linux computers until the mid 2000s. However, due to a change in its license terms, a fork was created and development moved on to X.org. The first version, X11R6.7.0, was forked from XFree86 version 4.4 RC2. X.org foundation leads this project as of today, and almost all Linux OS have moved on to X.org. It's available as open source under MIT license.

Official website: https://www.x.org/wiki/

Older defunct website for xfree86 (not supported anymore, so do not use this website): http://www.xfree86.org/

good intro here: http://linuxdocs.org/HOWTOs/XWindow-Overview-HOWTO/index.html

other detailed link here: http://www.linux-tutorial.info/modules.php?name=MContent&pageid=98

X uses client/server model. Usually, the local computer runs X server program. X server interacts with hardware: it takes i/p from mouse/keyboard and displays on the monitor. Programs such as Xterm, browser, etc are called X clients. X server communicates with various client programs. The server accepts requests for graphical output (windows) and sends back user input (from keyboard, mouse, or touchscreen). X clients don't interact with hardware directly. This way X clients need not worry about h/w, they just follow the protocol to interact with X server, and the X server handles the actual process of interacting with h/w.

X provides the library, Xlib, which handles all low-level client-server communication tasks. The client has to invoke functions contained within Xlib to get work done.

Windows manager (WM):

A window manager is a "meta-client", whose most basic mission is to manage other clients. The WM program decides where to place windows, gives mechanisms for users to control the windows' appearance, position and size, and usually provides "decorations" like window titles, frames and buttons, that give us control over the windows themselves. X window system provides the mechanism but not the policy (i.e how windows should look and behave). That policy is handled by window managers. The most basic WM is twm, and other more fancy ones are fluxbox, Xfce4, etc. WM and applications are usually built using toolkits or widget libraries. Using Xlib functions directly is pretty tedious, so libraries were developed thru which we could control widgets like menus, buttons, scroll bars, canvas, etc, much more easily. One of the earliest toolkits developed was Motif. However, Motif wasn't free, so alternatives were developed. One such very popular alternative was Gtk (GIMP Toolkit, originally written for the GIMP image editor). Another very popular toolkit, used in KDE, was Qt.

Desktop Environment (DE):

Since there are hundreds of different toolkits and they all have different looks and behaviour, applications developed using different toolkits will look different even when launched on the same computer. Even the way of launching pgms varies from one WM to another (i.e on some we may have to type a cmd, while some may have a gui to launch, etc). DE provides a set of facilities and guidelines aiming to standardize all this stuff. Under Linux, the 2 most popular DE are KDE and GNOME. KDE has kwm as its WM (which is mandated by KDE), but GNOME allows any WM (although its preferred WM is sawfish). GNOME uses the Gtk toolkit, and provides a set of higher-level functions and facilities through the gnome-libs set of libraries. The whole idea of a desktop environment is consistency, so all apps developed for one DE would look and feel the same (i.e scrolling, left/right clicks, etc). However, we are completely free to run apps developed for KDE on GNOME or vice versa, they would just look different compared to other apps.

GUI startup:

Assuming we have GNOME DE, this is what happens on powerup. As explained in Linux installation, after powerup and successful login, xinit is called for GUI desktops. Even the login i/f may be GUI (instead of text login i/f as found in older versions of Linux) which is managed by Display manager (dm). Here, dm starts the X server. When a Linux system starts X, the X server comes up and initializes the graphic device, waiting for requests from clients. First a program called gnome-session starts, and sets up the working session. A session includes things such as applications I always open, their on-screen positions, and such. Next, the panel gets started. The panel appears at the bottom (usually) and it's sort of a dashboard for the windowing environment. It will let us launch programs, see which ones are running, and otherwise control the working environment. Next, the window manager comes up. Since we're using GNOME, it could be any of several different window managers, but in this case we'll assume we're running Sawfish. Finally, the file manager comes up (gmc or Nautilus). The file manager handles presentation of the desktop icons (the ones that appear directly on the desktop). At this point my GNOME environment is ready to work.

Files:

X server executable is at /usr/bin/X. The X.org server executable is at /usr/bin/Xorg (as explained above, X.org was forked from XFree86 due to license issues). Xorg is what is run as X Window system server executable; X is just kept for compatibility, and it eventually passes cmds to Xorg (on my centOS system, X is just a softlink to Xorg). However, we never call Xorg executable directly to start X window system server, instead we call a script "startx".

startx => On a text only terminal, X server doesn't start automatically. The wrapper script "startx" can be typed on the text terminal to start X server. The startx script is a front end to xinit that provides a somewhat nicer user interface for running a single session of the X Window System. More details here: https://www.computerhope.com/unix/startx.htm

 

DISPLAY environment variable:

X server and X client need not be running on same computer. They can be on two computers. X client interacts with X server using TCP protocol via port 6000 or higher, when they are on diff m/c. An env variable "DISPLAY" is used by X client to find out where the X display server is (so that the client can interact with the user) and which screen it should use by default (on displays with multiple monitors). X client just sends its display data to X server, and then X server displays it based on what display is connected to it, and what display the X client has asked it to use.

X server manages the display. A display consists (simplified) of:

  • a keyboard,
  • a mouse
  • and a screen.

DISPLAY environment var indicates to graphical clients which display server (I/O devices) to connect to. It can be set to anything desired by user.

The display var is of the form

hostname:D.S

where:

hostname is the name of the computer where the X display server runs. An omitted hostname means the localhost. So, if we provide a name/addr here, that means X server is located at this IP addr.

D is a Display number (usually 0). It can be varied if there are multiple displays connected to one computer. TCP port used for Display 0 is 6000, Display 1 is 6001 and so on. The phrase "display" is usually used to refer to collection of monitors that share a common keyboard and pointer. Most workstations tend to only have one keyboard, and therefore, only one display with display number 0. Display number is required. When we connect via vnc to a remote m/c, it allows us to open connections to separate multiple displays as mc1.abc.com:0, mc1.abc.com:1, and so on. So, if 5 users are going to connect to a m/c via vnc, vnc can open connections to same m/c with 5 different displays, so that all 5 users can have their own display, even though all their pgms are running on same m/c on same cpu. These are virtual displays and Linux gui is displayed on each of these virtual displays independently.

S is the screen number. A display can actually have multiple screens. The phrase "screen" refers to physical monitors that share the same keyboard and mouse. So, on a system with 2 monitors, screen can take 2 values 0 or 1. Usually there's only one screen though where 0 is the default.

ex: echo $DISPLAY => returns :0 on my m/c by default. This means display is set to localhost, display 0, and screen 0 (since screen is omitted, it defaults to 0). Any X client, before connecting to X server, reads this var, and figures out where X display server is, that it's supposed to connect to. Here, it says it is connecting to X server on local computer. It connects to X server on local m/c, and asks it to display the data it's sending to display number 0.
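
The hostname:D.S format and the port rule (6000 + display number) described above can be put into a few lines of Python. This is only a sketch of the rules as stated in these notes, not how Xlib actually parses the variable. The names Display and parse_display are made up for illustration.

```python
from typing import NamedTuple


class Display(NamedTuple):
    host: str      # empty string means local m/c
    display: int   # D, the display number
    screen: int    # S, the screen number, 0 if omitted

    @property
    def tcp_port(self) -> int:
        # display 0 uses port 6000, display 1 uses 6001, and so on
        return 6000 + self.display


def parse_display(value: str) -> Display:
    """Split a DISPLAY value of the form hostname:D.S into its parts."""
    host, sep, rest = value.rpartition(":")
    if not sep or not rest:
        raise ValueError(f"not a valid DISPLAY value: {value!r}")
    disp, _, screen = rest.partition(".")
    return Display(host, int(disp), int(screen) if screen else 0)


d = parse_display("abc.com:0.0")
print(d.host, d.display, d.screen, d.tcp_port)  # abc.com 0 0 6000
print(parse_display(":0"))                      # Display(host='', display=0, screen=0)
```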

We can set DISPLAY var to any value we want, and in that shell, X clients launched will talk to X server indicated by the var.

In BASH shell, we set any env var using export:

ex: export DISPLAY=abc.com:0.0 => This makes any X client connect to X server on abc.com. This communication b/w X client and X server will be made over TCP on port 6000. Running a X client as "firefox" will connect to X server on computer named abc.com. Display and Screen used are 0. So, firefox window will end up showing on display connected to X server on abc.com. However, most of the times, computer abc.com won't allow just any computer to connect to its X display server. So, we'll get an error as "Error: cannot open display: abc.com:0.0".

In Csh, we set any env var using setenv:

ex: setenv DISPLAY abc.com:0.0

Many X clients also allow the X server to connect to be specified on the cmdline.

ex: xterm -display :0 => this opens new xterm window on local X server, display 0. Basically, xterm opens locally on same display as where you ran this cmd.

ex: xterm -display :1 => This will error out: "Can't open display: :1". This is because there is no display 1

Virtual display:

The question that comes to mind is how do we get multiple displays on a given computer. These multiple displays are called "virtual displays". Any Linux OS has 6 virtual displays from 0 to 5. It's explained below under tty section.

TTY:

If we set multi-user.target (look in systemctl section for more details) as default target, then system will start in CLI (cmd line i/f). Any Linux system starts multiple TTY sessions at boot up. Each TTY session can be thought of as a separate login terminal on the same running OS. Most Linux OS install 7 TTY sessions by default. CentOS has 6 tty, while Linux Mint has 7. TTY stands for TeleTYpe, an old piece of equipment which allowed typing, which would appear on paper, and then on pressing <ENTER> that was sent to the computer, and then the response would be typed back on the same piece of paper. On modern computers, cmd environment (whether CLI or GUI) is called a TTY or specifically a PTY (Pseudo TTY). A TTY session is the environment you are in while interacting with your computer. We use mouse, keyboard to enter inputs, but we can use virtual terminals (i.e "xterm" etc) to enter inputs (which eventually get their inputs from mouse/keyboard) and send it to processes that need it. That pseudo terminal that sends cmds to processes constitutes a TTY session.

So, these 7 (or 6 for some distro) different TTY session can be used to log into by 7 different users at same time. They are useful during debug too. These 7 TTY sessions are rep by device special files /dev/tty1 thru /dev/tty7. Of these 7 sessions (or 6 in CentOS), all but one start in CLI mode. In Linux Mint, the 7th session is the default session that is presented to the user as login screen. It can be CLI or GUI depending on whether we are in multi-user target or graphical target. These 7 TTY sessions can be entered by pressing Ctrl + Alt + F1 (for 1st tty) to Ctrl + Alt + F7 (for 7th tty). In CentOS, the first tty session (tty1) is the default session, and it has only 6 tty sessions from tty1 thru tty6. On any Linux distro, one of these many tty will be the default terminal (GUI or CLI), while remaining ones will be CLI terminal. Ctrl + Alt +F1 takes to tty1 session, while Ctrl + Alt + F7 takes to tty7 session. Which of these is default depends on OS distro.

We login to default CLI/GUI TTY session. However, we can login to other CLI TTY sessions too. We just have to press "Ctrl+Alt+F2" to login to 2nd tty session from the current session. Similarly for other sessions. In these 5 or 6 CLI TTY sessions, we login by providing username and password. Then we can use it the same way as we would use any shell terminal from within GUI (the same way as we use xterm within GUI). We can logout by typing "exit". From these 5 or 6 CLI terminals, we can enter into GUI windows by typing "startx". We can get out of GUI mode by logging out from GUI mode. That will bring us back to command line.

tty => this cmd prints the current terminal session, i.e /dev/tty4 etc for 4th CLI terminal (most likely the one on Ctrl + Alt + F4). For GUI sessions, it shows /dev/pts/1, /dev/pts/2, etc for multiple terminals opened within GUI. pts means pseudo terminal. The CLI session from which the GUI was started shows its /dev/tty number if "tty" is typed on the cmd line before the GUI gets invoked.
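
The two kinds of names printed by "tty" can be told apart with a small Python sketch. The function name classify_tty is made up for illustration, and the Ctrl + Alt + Fn hint simply follows the numbering described above.

```python
import re


def classify_tty(path: str) -> str:
    """Describe a device path as printed by the 'tty' cmd."""
    m = re.fullmatch(r"/dev/tty(\d+)", path)
    if m:
        n = m.group(1)
        return f"CLI tty session {n} (Ctrl+Alt+F{n})"
    m = re.fullmatch(r"/dev/pts/(\d+)", path)
    if m:
        return f"pseudo terminal {m.group(1)} (terminal window inside GUI)"
    return "unknown"


print(classify_tty("/dev/tty4"))   # CLI tty session 4 (Ctrl+Alt+F4)
print(classify_tty("/dev/pts/1"))  # pseudo terminal 1 (terminal window inside GUI)
```

From within Python, os.ttyname(sys.stdin.fileno()) returns the same path that "tty" prints. It raises OSError when stdin is not a terminal, so it is left out of the sketch.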

Unix supports 'device files', which aren't really files at all, but file-like access points to hardware devices. A 'character' device is one which is interfaced byte-by-byte (as opposed to buffered IO). /dev/tty is a character device file, it doesn't contain anything but can be read from and written to.

 

 

 

 

 

In the startup sequence discussed before, we saw that init was the first process to start. However, on newer linux versions, init has been replaced by systemd (system mgmt daemon, d represents daemon) as the first process to start. Init was based on old SysVInit, while systemd is its newer replacement. This is a very good link explaining it:

https://www.tecmint.com/systemd-replaces-init-in-linux/

/bin/systemd is run instead of /sbin/init as the process with PID=1. Purpose is the same. The name systemd may also refer to all the packages, utilities and libraries around the daemon, instead of just the daemon. However, on my Linux Mint 18, and other linux distro, I still see init as the first process to start, even though systemd is enabled. This is most likely because /sbin/init on such systems is just a symlink to the systemd executable, so the process still shows up as init.

Init and systemd have different ways of calling on a service (such as apache2, ssh, cron, etc). This link, explains diff b/w diff methods: https://askubuntu.com/questions/911525/difference-between-systemctl-init-d-and-service

There are 3 different methods:

1. SysVInit using init.d: Under init method, init scripts would be written for any service, and placed in /etc/init.d dir. Then the script in init.d dir would be called directly with options to start/stop/restart/etc for that service. This is not used anymore on new Linux distro, so no need to learn this anymore. It's mentioned briefly below, in case you encounter this on some systems (as Ubuntu).

Selected scripts in /etc/init.d dir are run at startup automatically. Whether the scripts in init.d dir should be run or not is decided by symbolic links to these scripts stored in various dir in /etc/rc*.d/ (rcS.d, rc0.d, ... rc6.d). dir rcS.d has all symlinks in it starting with S where "S" indicate that the service should be started. rc0.d has symlinks starting with S or K where "K" indicate that the service should be stopped (killed). These scripts in rc0.d are executed for runlevel0. Similarly rc1.d is for runlevel1 and so on. The default runlevel for Ubuntu is 2. 

In CentOS 6, this is what I see in dir: /etc/rc.d/rc2.d/

S10network -> ../init.d/network
K50netconsole -> ../init.d/netconsole

This indicates that "network" pgm would be started at startup, while "netconsole" won't.
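
The symlink names above encode what init should do with each script. Below is a small Python sketch that decodes such a name. The S/K meaning is as described above. Treating the two digits as an ordering number (lower runs first) is standard SysV behaviour, but it is an addition to what is written in these notes. The names RcLink and parse_rc_link are made up for illustration.

```python
import re
from typing import NamedTuple


class RcLink(NamedTuple):
    action: str    # "start" for S links, "stop" for K links
    priority: int  # two digit ordering number (assumption: lower runs first)
    service: str   # name of the script in init.d dir


def parse_rc_link(name: str) -> RcLink:
    """Decode a symlink name such as S10network from an rcN.d dir."""
    m = re.fullmatch(r"([SK])(\d{2})(.+)", name)
    if m is None:
        raise ValueError(f"not an rc symlink name: {name!r}")
    action = "start" if m.group(1) == "S" else "stop"
    return RcLink(action, int(m.group(2)), m.group(3))


print(parse_rc_link("S10network"))
# RcLink(action='start', priority=10, service='network')
print(parse_rc_link("K50netconsole"))
# RcLink(action='stop', priority=50, service='netconsole')
```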

If we want to add a new pgm to start at startup, we don't modify the links directly, but run "update-rc.d" (on Debian/Ubuntu based distro) to update the links to enable/disable pgm.

ex: sudo update-rc.d <pgm1> enable => This updates symlinks to remove "K" symlinks and replaces them with "S" symlinks for program "pgm1". So, now pgm1 would start by default at startup

ex: sudo update-rc.d <pgm1> disable => This updates symlinks to remove "S" symlinks and replaces them with "K" symlinks for program "pgm1". So, now pgm1 would not start by default at startup

NOTE: On most new installation of any Linux distro, you won't see any files in /etc/init.d/ as those files have moved to systemd. On CentOS 6, I don't see any pgm in /etc/init.d, while on Linux Mint 19 (Tara), I see about 50 or so pgms as apache2, bluetooth, sql, etc in that dir. This indicates that many pgms still install files in /etc/init.d/ to control their startup behaviour. In CentOS 6, I do see a README in /etc/init.d/README stating that all the files have moved. However, do note that on many linux distro, traditional init scripts continue to function on a systemd system (even though all the files have moved to systemd). An init script /etc/rc.d/init.d/foobar is implicitly mapped into a service unit foobar.service during system initialization. This is to maintain compatibility, so that existing scripts or any new scripts in /etc/init.d/ don't suddenly stop working.

ex: /etc/init.d/apache2 status => Here, apache2 process init script is called to check status of apache2

2. SysVinit using service: Later on, service cmd was used for SysVinit based systems. It was intended to provide smoother transition into system dependency handling. In most cases,  this newer service cmd just linked scripts in /etc/init.d dir, while in some cases, it called scripts located in entirely diff dir. However, this method is not supported by all Linux distro, and most newer programs, don't even provide support for service. It's mostly there for legacy purpose, and should be avoided all together.

ex: sudo service apache2 status => may work on some distro

3. SystemD using systemctl: This is the newer method of controlling services. Earlier Linux OS used init system, whose fundamental purpose was to init the components that must be started after Linux kernel is booted, and then manage services and daemons. Now, most Linux distro have switched to using systemd, instead of init. However systemd also does same tasks as what "init" used to do. Moving forward, this is the only method that will be supported on all Linux systems. All others will be deprecated. So, use this as preferred method on newer Linux distro, that support it.

link with syntax of systemctl: https://www.digitalocean.com/community/tutorials/how-to-use-systemctl-to-manage-systemd-services-and-units

systemctl => cmd to control systemd "system and services manager". It can be used to enable, start, stop, check status of system services as sshd, httpd, etc.

/etc/systemd/ => contains all systemd config files. systemd looks in /etc/systemd/system/<some_target>.target.wants/ dir for autostart of any systemd application. We see such dir for targets as basic, graphical, bluetooth, getty, network, sysinit, multi-user, etc. Each target is dependent on bunch of services that it runs. Symbolic links to *.service files are created here from below dir, whenever we want to autostart any service at boot.

/usr/lib/systemd/system/*.service => This has service files for applications as gdm, bluetooth, NetworkManager, etc. The same dir also has target and wants units.

/lib/systemd/system/ => has all units as service, target, wants, etc. This is basically replica of above dir. This is where systemctl looks for, so whatever program is in this dir, is what is supported by systemctl ??

ex: sudo systemctl status apache2.service => This works on all systems supporting systemd. Here, it looks for file "/lib/systemd/system/apache2.service"

ex: sudo systemctl status apache2 =>Here, we omitted the extension, .service, but it still works, as by default, systemctl cmd looks for .service file.
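
The "omitted extension defaults to .service" rule can be sketched in Python as below. This only illustrates the rule as described above. The real systemctl does more name handling than this, and the suffix list is an assumption based on the common systemd unit types. The name normalize_unit is made up for illustration.

```python
# Common systemd unit suffixes (assumed list, for illustration only)
UNIT_SUFFIXES = (
    ".service", ".socket", ".device", ".mount", ".automount", ".swap",
    ".target", ".path", ".timer", ".slice", ".scope",
)


def normalize_unit(name: str) -> str:
    """Append .service when the name carries no known unit suffix."""
    return name if name.endswith(UNIT_SUFFIXES) else name + ".service"


print(normalize_unit("apache2"))            # apache2.service
print(normalize_unit("apache2.service"))    # apache2.service
print(normalize_unit("multi-user.target"))  # multi-user.target
```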

 Some systemd cmds for xinit/graphical services are explained in "Linux Installation".

 

 

 

USB: widely used communication protocol. At its core, it has a pair of differential pins for data xfer. There is no clk pin, as clk is embedded within data, and is recovered from there. USB is a confusing protocol with a lot of different terminology and loosely used terms. It defines 2 kinds of spec. One relates to the data communication protocol over wires (i.e USB 1.0, etc), while other refers to the physical connectors (i.e Type C, etc). On top of data communication protocol, USB also defines power delivery communication protocol (PD 1.0, etc). Not all data/power communication protocols are valid over all connector types. On many occasions, the 3 specs are mixed together. We'll talk about these separately.

History:

  • USB 1.0: Very earliest usb, came out in 1996. It is called LS (low speed) USB. It has speeds of 1.5Mbits/sec (or equivalently a clk with freq of 1.5MHz). It's power src is 5V @ 100mA (=0.5W max power)
    • USB 1.1: incremental update to USB 1.0, came out in 1998. It is called FS (full speed) USB. It has speed of 12Mbits/sec.
  • USB 2.0: next gen of USB, came out in 2000. It is called HS (high speed), and has speeds of 480Mbits/sec (gigantic leap). Power src was also improved to allow more power to be delivered at 2.5W = 5V @ 500mA.
  • Up to USB 2.0, usb connectors were identical. They were either Type A connectors, or Type B connectors. They had 4 pins = VBUS, GND, D+, D-. The data pins D+/D- were used for both transmitting and receiving data, so they were half duplex pins.
  • USB 3.0: 3rd gen of USB, came out in 2008. It is called SS (super speed) USB. It had 10X higher speeds of 5Gbits/sec (equiv to 5GHz clk freq). However, such high clk freq of 5GHz was not needed to provide 5Gbits/sec bandwidth. Instead USB 3.0 std added new wiring for full duplex operation, i.e new pins do not need to be bidirectional, one pair was receiver, while other pair was transmitter. To have backward compatibility, D+/D- pins were left untouched as bidirectional. New TX+/TX- and RX+/RX- pins were added. New USB 3.x connectors were made, which were still backward compatible with USB connectors from prev gen. New USB 3.x connectors just added new superspeed pins for faster data transfer, either by changing shapes of type A, type B connectors, or getting new pins in b/w the existing pins w/o changing shapes. All std connectors as type A, type B in all forms were supported. It supported 8B/10B encoding. There were new type of connectors called Type-C (explained below) which exclusively carried USB 3.2 signals.
    • USB 3.1: incremental update to USB 3.0, came out in 2013. It is called SS+ (super speed plus) USB. It doubled the speeds to 10Gbits/sec. It was called USB 3.1 and replaced the older USB 3.0 std. However, to distinguish b/w the older 3.0 vs the new 3.1 std, the older USB 3.0 was renamed as USB 3.1 Gen 1 (with per lane speeds of 5Gbits/sec). The newer std of USB 3.1 was renamed as USB 3.1 Gen2 (with per lane speeds of 10Gbits/sec). USB 3.1 Gen2 moved to the more efficient 128B/132B encoding (the original USB 3.0 spec, i.e Gen1, used 8B/10B). USB 3.1 spec was backward compatible with USB 3.0 and USB 2.0. From now on, there was USB 3.1 spec and USB 3.0 became obsolete (USB 3.0 got rebranded as USB 3.1 gen1).
    • USB 3.2: incremental update to USB 3.1, came out in 2017. It doubled the speeds to 20Gbits/sec, while still preserving SS and SS+ speeds from prev gen. It was called USB 3.2 and replaced the older USB 3.1 std. In order to double the speeds, it used 2 lanes for data xfer, instead of the single lane that was used until now. To support dual lanes, it required new type of connectors, called Type C connectors. These Type C connectors introduced the concept of Alternate mode (explained later). In order to support all prev gen, its naming got more complex. Below naming is what is used. x1 or x2 implies single or dual lane. Gen 1 implies lower speed per lane, while successive Gen imply higher speed per lane. Gen 1 has speed of 5Gbps/Lane, Gen 2 has 10Gbps/Lane, Gen 3 has 20Gbps/Lane while Gen 4 has 40Gbps/Lane.
      • USB 3.2 Gen 1x1 = this is USB 3.1 Gen1 (aka USB 3.0, SS). Still supported single lanes (x1), so could use older connectors. Speed = 5Gbps/Lane, Lanes=1, Net speed or throughput =5Gbps.
      • USB 3.2 Gen 2x1 = this is USB 3.1 Gen2 (aka SS+). Still supported single lanes (x1), so could use older connectors. Speed = 10Gbps/Lane, Lanes=1, Net speed or throughput =10Gbps.
      • USB 3.2 Gen 1x2 = This is proper USB 3.2, with lower speeds. This has the same raw speed as USB 3.1 Gen2 (SS+) of 10Gbits/sec, but slightly lower effective throughput, since it keeps the 8B/10B coding of Gen 1 instead of the more efficient 128B/132B coding of Gen 2. It supplies data over 2 lanes (x2) with SS speeds, so kind of similar to USB 3.1 Gen1, but repeated twice (effective clk freq = 5Ghz). It needed newer type C connectors, to support dual lanes. Speed = 5Gbps/Lane, Lanes=2, So Net Speed or throughput =10Gbps.
      • USB 3.2 Gen 2x2 = This is proper USB 3.2, with higher speeds. This has higher speeds of 20Gbits/sec. It supplies data over 2 lanes with SS+ speeds, so kind of similar to USB 3.1 Gen2, but repeated twice (effective clk freq = 10Ghz). It needed newer type C connectors, to support dual lanes. USB 3.2 Gen 1x2 and Gen 2x2 is what we commonly see in today's devices which have USB PD connectors. These use both lanes (lane0 and lane1) for data xfer. Speed = 10Gbps/Lane, Lanes=2, So Net Speed or throughput =20Gbps.
  • USB4 (or USB 4.0): This new std, announced in 2019, replaces all earlier std. It doesn't have a space b/w USB and 4 in its name (it's written as USB4 and NOT "USB 4"; however, most online docs and specs still write it as USB 4, which is incorrect), and is not expected to have further iterations as 4.1, 4.2, etc. It works only on Type C connector (with USB PD running too). It supports speeds from 20Gbps to 120Gbps. It also supports more AltMode xfer than USB 3.2 (ThunderBolt3 as well as Display port). USB4 by itself does not provide any generic data transfer mechanism as USB 3.x, but serves mostly as a way to tunnel other protocols like USB 3.2, DisplayPort, and optionally PCI Express. The biggest advantage of USB4 is that it allows to share bandwidth b/w video and data.
    • USB4 Version 1: This is version 1.0 that was released on 2019. It has double the max speed of USB 3.2 with speeds up to 40Gbits/sec (20Gbits/sec per lane). It initially supported altMode DP 1.4 only, but in 2020, spec provided support for altMode DP 2.0 too. DisplayPort 2.0 can support 8K resolution at 60 Hz with HDR10 color and can use up to 80 Gbit/s which is same amount available to USB data, but just unidirectional. It's named as Gen2 for speeds similar to USB 3.2 and Gen3 for higher speeds. Below are USB4 Gen2 details:
      • USB4 Gen 2x1 = This has same speed as USB 3.2 Gen 2x1. It has single lane at 10Gbps. Speed = 10Gbps/Lane, Lanes=1, Net speed or throughput =10Gbps.
      • USB4 Gen 2x2 = This has same speed as USB 3.2 Gen 2x2. It has dual lanes at 10Gbps per lane for a total of 20Gbps. This is marketed as "USB4 20Gbps" or "20" written on peripheral instead of "SS 10" or "SS 20" that was written for USB3. Speed = 10Gbps/Lane, Lanes=2, So Net Speed or throughput =20Gbps.
      • USB4 Gen 3x1 = This is real USB4 where speed for single lane is doubled to 20Gbps. It has single lane at 20Gbps. Speed = 20Gbps/Lane, Lanes=1, Net speed or throughput =20Gbps.
      • USB4 Gen 3x2 = This has Gen 3 with dual lanes at 20Gbps per lane for a total of 40Gbps. This is marketed as "USB4 40Gbps" or "40" written on peripheral instead of "SS 10" or "SS 20" that was written for USB3. Speed = 20Gbps/Lane, Lanes=2, Net speed or throughput =40Gbps.
    • USB4 Version 2: This is Version 2, released in 2022. It doubled the speed to 80Gbps (40Gbps per lane). It's named Gen 4. It works in both Symmetric and Asymmetric mode, where the number of TX lanes doesn't need to be the same as the number of RX lanes. Below are USB4 Gen 4 details:
      • USB4 Gen 4x1 = This is Gen 4 where speed for single lane is doubled to 40Gbps. It has single lane at 40Gbps. Speed = 40Gbps/Lane, Lanes=1, Net speed or throughput =40Gbps.
      • USB4 Gen 4x2 = This is Gen 4 with dual lanes at 40Gbps per lane for a total of 80Gbps. This is marketed as "USB4 80Gbps" or "80" written on peripheral? Speed = 40Gbps/Lane, Lanes=2, Net speed or throughput =80Gbps.
      • USB4 Gen 4x3 = This is Gen 4 with triple lanes at 40Gbps per lane for a total of 120Gbps. This works in Asymmetric mode, where 3 lanes are used for TX and 1 lane for RX or vice versa. Speed = 40Gbps/Lane, Lanes=3, Net speed or throughput =120Gbps.
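The naming scheme in the list above boils down to "per-lane speed times number of lanes". A minimal sketch of that arithmetic is below; the table `LANE_SPEED_GBPS` and the helper `net_speed_gbps` are illustrative names made up for this example, and the numbers are nominal signaling rates only (coding overhead is ignored).

```python
# Nominal per-lane speed in Gbps for each (standard, generation) pair,
# copied from the list above. Hypothetical helper names, not from any spec.
LANE_SPEED_GBPS = {
    ("USB 3.2", 1): 5,
    ("USB 3.2", 2): 10,
    ("USB4", 2): 10,
    ("USB4", 3): 20,
    ("USB4", 4): 40,
}


def net_speed_gbps(standard, mode):
    """Return nominal net speed for a mode string such as 'Gen 2x2'."""
    gen_text, lanes_text = mode.replace("Gen", "").strip().split("x")
    gen, lanes = int(gen_text), int(lanes_text)
    return LANE_SPEED_GBPS[(standard, gen)] * lanes


if __name__ == "__main__":
    examples = [
        ("USB 3.2", "Gen 1x2"),
        ("USB 3.2", "Gen 2x2"),
        ("USB4", "Gen 3x2"),
        ("USB4", "Gen 4x3"),
    ]
    for standard, mode in examples:
        print(f"{standard} {mode}: {net_speed_gbps(standard, mode)} Gbps")
```

Reading the name right to left is usually the quickest way to decode it: the digit after "x" is the lane count, the digit before it picks the per-lane speed.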

 

USB connectors:

USB connectors also come in different types (for each type, they may come in different sizes):

1. Type A: This has the standard 4 pins. It plugs into downstream facing ports (DFP), i.e. the host is the source. The original size is called the Standard USB connector. This is the most commonly used USB connector, seen in USB drives, laptops, etc. Later in 2000, a Mini USB connector was released that was about half the width of the standard connector. It was used in cameras, MP3 players, etc., but has been phased out since 2018. The successor to Mini USB is Micro USB, which is even smaller and flatter. This is the most compact form of USB connector, and is seen in phones, tablets, etc. So today it comes in 2 sizes: Standard and Micro.

Pins:    GND   D+   D-   VBUS => 4 pins 

2. Type B: This also has standard 4 pins. It plugs into upstream facing ports (UFP), i.e device is the sink. Here, the connector size is different. It's square shaped, and is mostly seen in printers.

3. Type C: This is a totally new connector with 24 pins. It is reversible and has an oval shape with rounded edges. As the whole USB arch changed with Type C, it required new connectors, called Type C connectors. Unlike Type A and Type B connectors, which came in diff sizes, Type C connectors come in only 1 size. Starting with proper USB 3.2, we need Type C connectors to get higher speeds.

 

Pins side 1:    GND        TX1+  TX1-        VBUS           CC1    D+  D-   SBU1          VBUS            RX2-  RX2+         GND  => 12 pins

Pins side 2:    GND        RX1+  RX1-        VBUS          SBU2   D-   D+   CC2           VBUS            TX2-  TX2+         GND  => 12 pins

(NOTE: the pins on the lower side are arranged symmetrically to those on the upper side, so that the connector can be reversible)
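The symmetry in the two pin rows above can be checked mechanically. The sketch below assumes that turning the plug over puts the side-2 row, read right to left, where the side-1 row used to be; the connector is reversible if that flipped row matches side 1 except for the lane/channel index (1 vs 2). The function and constant names are made up for this illustration.

```python
# Pin rows copied from the listing above, left to right.
SIDE_1 = "GND TX1+ TX1- VBUS CC1 D+ D- SBU1 VBUS RX2- RX2+ GND".split()
SIDE_2 = "GND RX1+ RX1- VBUS SBU2 D- D+ CC2 VBUS TX2- TX2+ GND".split()

# Translation table that swaps the index digits 1 and 2 in a signal name.
SWAP_INDEX = str.maketrans("12", "21")


def flipped_row(side_2):
    """Signals that land on the side-1 positions after a 180 degree flip."""
    return list(reversed(side_2))


def is_reversible(side_1, side_2):
    """True if the flipped row equals side 1 once indices 1 and 2 are swapped."""
    flipped = [name.translate(SWAP_INDEX) for name in flipped_row(side_2)]
    return flipped == list(side_1)


if __name__ == "__main__":
    print(flipped_row(SIDE_2))
    print("reversible:", is_reversible(SIDE_1, SIDE_2))
```

The index swap is the reason the port still needs to detect orientation: power, ground and D+/D- line up either way, but lane 1 and lane 2 (and CC1/CC2, SBU1/SBU2) trade places.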

 

 

  • D+/D- pins are retained to provide USB 2.0 functionality
  • 2 lanes of TX1/RX1 and TX2/RX2 provide dual lane capability to do high speed data xfer for USB 3.0 and above (GHz speeds)
  • SBU1/SBU2 pins are for providing side band signals.
  • CC1/CC2 pins are for providing configuration channel signals (used for PD protocol explained below)

 

USB Protocol:

USB 2.0 and earlier had only 4 pins, with only D+ and D- as data lines. Basically, data was the only signal sent out, with no associated clock. Data was sent with enough transitions so that a clock could be recovered from it. This makes the transactions simple to understand. USB3 and later became much more complicated, and their transactions are not so easy to understand.

This video connects an oscilloscope to the 2 usb data wires (D+ and D-) going from keyboard to PC, and examines the traffic, and figures out all transactions by manually drawing them on a piece of paper. Excellent video if you want to understand basic packet communication for USB2.

How does a USB keyboard work (by Ben Eater) => https://www.youtube.com/watch?v=wdgULBpRoXk

 

Power Delivery (PD):

With Type C came a whole new functionality for power delivery. Until then, USB was primarily for data xfer. However, with the advent of portable devices that did not need much power to operate, the dedicated power pins on such portable devices were omitted. Instead, portable devices started drawing their power from 2 USB pins => VBUS and GND. This was OK in the 1990s, as the 0.5W supplied by a USB 1.0 host was sufficient to power these small devices. However, the power requirement of these portable devices started going up in the early 2000s. To accommodate that, the USB standard raised the power delivering capability to 2.5W in USB 2.0.

Then battery charging protocol was released in 2010 for USB 2.0 known as BC 1.1. This allowed up to 7.5W power from USB ports. What power to deliver was based on Resistance values on D+/D- pins. Refined BC 1.2 spec was released later. BC protocols are very complex and confusing. We'll not go into more details on those.

PD 1.0:

However, with power requirements for portable devices increasing further, a new spec for power delivery known as PD 1.0 was released. This allowed >7.5W power to be delivered over the USB power pins. It was supported over Type A and Type B connectors. Devices can request higher power from the host. Voltages of 5V, 12V and 20V on VBUS were now supported, with currents up to 5A, resulting in 100W max power delivery. However, for V>5V and I>3A, dedicated power communication pins such as CC1/CC2 were needed, which were part of the USB Type C spec.
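The power figures quoted so far all follow from P = V x I. The sketch below works them through; it assumes VBUS is at 5V for the USB 1.0, USB 2.0 and BC 1.1 figures (an assumption made for this example, the text only states the wattages), and uses the 20V / 5A maximum stated for PD 1.0. Helper names are made up.

```python
VBUS_DEFAULT_V = 5.0  # assumed VBUS level for the pre-PD power figures

# Power limits in watts as quoted in the text.
POWER_LIMITS_W = {
    "USB 1.0": 0.5,
    "USB 2.0": 2.5,
    "BC 1.1": 7.5,
}


def current_a(power_w, volts=VBUS_DEFAULT_V):
    """Current needed to deliver power_w at the given voltage (I = P / V)."""
    return power_w / volts


def power_w(volts, amps):
    """Delivered power for a given voltage and current (P = V * I)."""
    return volts * amps


if __name__ == "__main__":
    for name, limit in POWER_LIMITS_W.items():
        print(f"{name}: {limit} W at 5 V needs {current_a(limit):.1f} A")
    print(f"PD max: 20 V x 5 A = {power_w(20.0, 5.0):.0f} W")
```

The jump to 100W comes mostly from raising the voltage: at 5V the same 5A would deliver only 25W.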

PD 2.0:

With the release of USB 3.1 spec, PD 2.0 was released as part of this spec. It allowed power delivery to happen over USB Type C connectors/cable with special dedicated pin for power delivery communication. These new pins for power related communication were CC1/CC2 pins. Power was delivered over 4 VBUS pins. However for backward compatibility, previous BC1.2 and PD1.0 were still supported over D+/D- pins.

PD 2.0 is :

- single wire protocol on CC wires

- DFP is bus master and initiates all communication

- All msg are 32 bit 4B/5B encoded. It uses BMC (Biphase Mark Coded) encoding, which is a version of Manchester coding.

- It has a Baud rate of 300K (i.e. clk rate of 300KHz), so a pretty low frequency (which is OK, since power requirements do not change frequently, i.e. not within microseconds)

- It supports CRC error detection + message retries
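To make the BMC (Biphase Mark Coded) part of the list above concrete, here is a minimal sketch of the coding rule only: the line level toggles at the start of every bit, and toggles once more in the middle of the bit when the bit is a 1. Each bit is represented as two half-bit levels. This is an illustration of the coding idea, not of the full PD framing (4B/5B symbols, preamble and CRC are left out), and the function names are made up.

```python
def bmc_encode(bits, level=0):
    """Encode bits into a list of half-bit line levels using BMC."""
    halves = []
    for bit in bits:
        level ^= 1              # every bit starts with a transition
        halves.append(level)
        if bit:
            level ^= 1          # a 1 adds a second transition mid-bit
        halves.append(level)
    return halves


def bmc_decode(halves):
    """Recover bits: a bit is 1 when its two half-bit levels differ."""
    pairs = zip(halves[0::2], halves[1::2])
    return [1 if first != second else 0 for first, second in pairs]


if __name__ == "__main__":
    data = [1, 0, 1, 1]
    line = bmc_encode(data)
    print(line)              # [1, 0, 1, 1, 0, 1, 0, 1]
    print(bmc_decode(line))  # [1, 0, 1, 1]
```

Because there is a guaranteed transition at every bit boundary, the receiver can recover the clock from the data, which is why a single CC wire is enough.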

Of the 2 configuration channel pins, CC1/CC2, only one is connected through the cable to the other end. Which of the two shows a valid connection tells the port the cable orientation, and that same wire is then used for PD communication. The CC wire used for PD communication needs to have valid resistances tied to Power/Gnd. That is how the USB logic figures out which of CC1 or CC2 to use for PD purposes.

The CC wire for PD should have a 5 Kohm resistance tied to GND on the UFP (sink device such as a mouse), and a 10 Kohm to 56 Kohm resistance tied to VBUS on the DFP (host device such as a PC). These resistances form a voltage divider that sets the voltage level of the CC wire. This voltage determines the initial current level supported by USB.

Rd = pull down resistance = 5 Kohm

Rp = pull up resistance = 3 different values:

  • 56 Kohm = default = 0.5A @ 5V for USB 2 and 0.9A @ 5V for USB 3. Voltage level on CC wire = 5/(56+5) * 5V = 0.41V
  • 22 Kohm = 1.5A @ 5V. Voltage level on CC wire = 5/(22+5) * 5V = 0.93V
  • 10 Kohm = 3.0A @ 5V. Voltage level on CC wire = 5/(10+5) * 5V = 1.67V

Any voltage well above the 3.0A level (about 1.67V) on the CC wire is taken as no connection, as then the pull down resistance is assumed to be much larger than 5 Kohm (ideally Rd is infinity, and the voltage detected on the CC wire would be 5V)
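The CC voltage levels above are a plain voltage divider, V_cc = V_pullup * Rd / (Rp + Rd). The sketch below recomputes them using the rounded values from the text (Rd = 5 Kohm, pull-up to 5V); the constant and function names are made up for this example, and real designs should take exact resistor values and detection thresholds from the Type C spec.

```python
V_PULLUP = 5.0   # volts, pull-up rail assumed in the text
RD_KOHM = 5.0    # sink pull-down, rounded value used in the text

# Source pull-up values and the current level each one advertises.
RP_KOHM = {
    "default (0.5A/0.9A)": 56.0,
    "1.5A": 22.0,
    "3.0A": 10.0,
}


def cc_voltage(rp_kohm, rd_kohm=RD_KOHM, v_pullup=V_PULLUP):
    """Voltage on the CC wire for a given Rp/Rd divider."""
    return v_pullup * rd_kohm / (rp_kohm + rd_kohm)


if __name__ == "__main__":
    for label, rp in RP_KOHM.items():
        print(f"Rp = {rp:>4.0f} Kohm -> {cc_voltage(rp):.2f} V  ({label})")
    # With no sink attached Rd is effectively infinite and CC sits near 5 V.
    print(f"open circuit (Rd = 1e9 Kohm) -> {cc_voltage(56.0, rd_kohm=1e9):.2f} V")
```

A smaller Rp pulls the CC wire higher, so the sink can read the advertised current just by measuring the CC voltage, before any PD message is exchanged.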

Successful attachment of the connector is indicated by the presence of a valid voltage on one of the CC wires. That wire is then used for PD communication. Once a default power supply is provided based on the resistance values, further PD communication can happen on the CC wire. This PD communication happens via messages sent b/w src and sink. Higher voltage levels on VBUS and higher currents can only be supported after successful negotiation.

PD controller: There is a separate chip called the "PD controller" chip, provided by major vendors such as TI, STM, etc., that provides all the functionality to deal with PD communication. It takes in the D+/D- lines as well as the CC1/CC2 lines, and does all the power negotiation work to put the correct voltage and current on VBUS. High speed data lines go through another chip, which is the main USB data controller chip. D+/D- lines also go to this main controller chip, as they carry data too. The reason D+/D- go to the PD controller chip is to determine power delivery for USB 2.0 (they aren't used for data communication at all in the PD controller chip).

Alternate Mode:

A very useful mode in USB Type C connectors is Alt Mode. It allows these connectors to support not only native USB mode data xfer, but also alternate mode data xfer (aka AltMode) for other protocols (such as Display Port and HDMI protocols) on the same wires.

Alternate modes are defined and configured via USB PD protocol. Same CC1/CC2 lines are used for AltMode configuration. Structured VDM (vendor defined messages) are used on CC lines to enter AltMode. Unused pins on USB-C are used for Alt Mode of operation.

Display Port (DP) AltMode: This is one of the Alt Modes supported on USB-C. The DP protocol requires 4 lanes (8 pins, since the signals are differential) for data xfer, 1 lane as an auxiliary channel (2 pins, since the signals are differential), and 1 wire as HPD (hot plug detect = this line is used by the upstream device to detect plugging of a downstream device).

DP Alt Mode allows the TX1/RX1 and TX2/RX2 lanes to be used as data lanes for DP, and SBU1/SBU2 to be used as the auxiliary lane (since the auxiliary lane does not have a high speed requirement, these 2 lines can be used as a differential pair), and the HPD line is provided by the USB PD controller. There is a mux that is controlled by the PD controller. This mux takes data from the USB lines, and routes them to either the USB controller (for regular mode) or to display devices (for AltMode). The 4 data lanes are all unidirectional, as display data only goes from source to sink (to be consumed), so all 4 lanes are configured as TX lines. DP is called a simplex protocol (unidirectional xfer), while USB is a duplex protocol (bidir xfer).

ThunderBolt3 AltMode: This is another AltMode supported on USB-C for USB4 protocol.

USB4 AltMode: This is yet another AltMode

MultiFunction AltMode: A combination of regular mode on one of the lanes and AltMode on another, i.e USB3.2 + DP
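The mux behaviour described for DP AltMode and MultiFunction AltMode can be pictured as a routing table from Type C pin groups to destinations. The sketch below is purely illustrative: the mode names, the `route_pins` helper and the choice of which lane group stays on USB in multifunction mode are assumptions made for this example (the real pin assignments depend on cable orientation and the DP Alt Mode spec).

```python
HIGH_SPEED_PAIRS = ("TX1", "RX1", "TX2", "RX2")  # 4 differential pairs
SIDEBAND_PINS = ("SBU1", "SBU2")


def route_pins(mode):
    """Return a dict mapping each pin group to where the mux sends it."""
    if mode == "usb":
        routing = {pair: "USB controller" for pair in HIGH_SPEED_PAIRS}
        routing.update({pin: "unused" for pin in SIDEBAND_PINS})
    elif mode == "dp":
        # All 4 pairs become unidirectional DP data lanes.
        routing = {pair: "DP data lane" for pair in HIGH_SPEED_PAIRS}
        routing.update({pin: "DP AUX" for pin in SIDEBAND_PINS})
    elif mode == "multifunction":
        # Assumed split: lane group 1 keeps USB 3.2, lane group 2 carries DP.
        routing = {"TX1": "USB controller", "RX1": "USB controller",
                   "TX2": "DP data lane", "RX2": "DP data lane"}
        routing.update({pin: "DP AUX" for pin in SIDEBAND_PINS})
    else:
        raise ValueError(f"unknown mode: {mode}")
    return routing


def dp_lane_count(mode):
    """Number of high speed pairs handed to DisplayPort in a given mode."""
    return sum(1 for dest in route_pins(mode).values() if dest == "DP data lane")


if __name__ == "__main__":
    for mode in ("usb", "dp", "multifunction"):
        print(mode, "->", dp_lane_count(mode), "DP lanes")
```

In multifunction mode DP gets only 2 of the 4 pairs, so the display bandwidth is halved in exchange for keeping a USB 3.2 data path alive.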