History of Simulators

Verilog-XL (from Gateway Design Automation) was the first, and at the time the only, Verilog simulator available for signoff in the early 1990s. Cadence bought it, but ended support at Verilog-1995, and developed its own compiled code simulator (NC-Verilog). Docs from Cadence still refer to Verilog-XL when talking about NC-Verilog. The modern version of the NCSim family is IES, which was recommended for newer projects. However, as of 2018, IES has been replaced by the even newer Xcelium simulator. VCS (Verilog Compiled code Simulator, 1st SystemVerilog simulator) from Synopsys and ModelSim (ModelTech simulator, 1st VHDL simulator) from Mentor Graphics are the other two qualified for ASIC signoff. All 3 support V2001, VHDL-2002 and SV2005. ModelSim is interpreter based, so it's much slower than VCS and NC-Verilog, which are compiler based.

Cadence Simulator: Incisive Enterprise Simulator (IES) 9.2, verilog-XL (ncverilog) 9.2 from Cadence is the latest simulator (as of 2019). As of 2021, Xcelium from Cadence is widely used.

Cadence IES simulator:

Cadence Incisive sim (IES) is based on Cadence's Interleaved Native Compiled Code Architecture (INCA), which is an extension of the Native Compiled Code Architecture (NCA). With INCA, we can verify multiple languages (verilog, VHDL, SV, Specman, SystemC, Verilog AMS, VHDL AMS, C, C++, SPICE files, etc), multiple levels (behavioral, rtl, gates), multiple paradigms (event driven, cycle based), and mixed signals (digital, analog). It provides high performance with the accuracy of event driven simulation (found in interpreted and compiled code tech).
In an NCC simulator, a parser produces an intermediate representation of the input source text. This intermediate representation is then processed by a code generator that produces relocatable machine code that runs directly on the host processor. For example, in a Verilog/VHDL configuration, both the Verilog and VHDL compilers are used to generate code for the Verilog and VHDL portions of the design, respectively. During an elaboration process similar to the linking used in computer programming, the Verilog and VHDL code segments are combined into a single code stream. This single executable is then directly executed by the host processor.
For RTL designs, a min of 64 MB of memory is required, while gate simulation of 150K gates requires a min of 128 MB.

Simulator supports IEEE 1364-2001 std for verilog, OVI 2.0, and verilog XL. SystemVerilog extensions to verilog as defined in IEEE P1800 std are also implemented. We use the compiler (ncvlog) and then the elaborator (ncelab), which are integrated into IES. When we compile and elaborate a design, all internal representations of cells and views reqd by the simulator are contained in a single file stored in the lib dir. Compiler will automatically create a default work library called worklib in a directory called INCA_libs, which is under the current directory. All design units are compiled into this library.

Cadence Xcelium simulator:

Early simulators processed verilog code in a single thread, managing a single active queue of events. This serial method resulted in significant run time. Xcelium simulator is basically the same as IES, except that it can be run in single core or multi core configuration. Multi core configuration can shorten runtime considerably, by breaking the dependencies in RTL/gate designs into independent parts, and simulating these parts using independent threads on parallel processors. Xcelium partitions the design into accelerated (ACC) and non accelerated (NACC) regions. ACC region contains the RTL/gate design, which can be run as parallel threads, while NACC region contains behavioural portions such as testbench, behavioural (model) memories, etc which are run by the single core engine. This multi core engine compiler is invoked by passing option "-mcebuild". Compiler will automatically create a default work library called worklib in a directory called xcelium.d, which is under the current directory. All design units are compiled into this library, as well as other libs explained later.

Example of simple design, testbench and testcase:

//simple verilog code that will compile and run: tb.v. To run it, use cmd: irun tb.v
module tb();
 int a;
 initial begin
   $display("a=%d",a);
   //$finish; => this not needed as there's only this file with initial, so nothing is running forever
 end
endmodule

//to run a simple module, create a tb, and change signals at module i/p pins using initial block.
// To run it, use cmd: irun tb.v Top_module.v +access+r -timescale 1ns/1ps => access option needed so that waveforms can be dumped.
module tb(); //brackets optional
 int a;
 reg b,c; //reg needed as wire can't be assigned in procedural (initial/always) blocks
 
 Top_module I_top (.IN1(b), .IN2(c)); //top module connections => preferred way
 //assign Top_module.IN1 = b; assign Top_module.IN2 = c; => instead of instantiating Top_module as in above line, we can also directly connect pins to nets. NOTE: since IN1,IN2 are nets, "always *" won't work, since it needs regs. so, we use assign.
 initial begin //to apply i/p stimuli and to end sim. Usually this whole block is placed in tc_1.v file, so that we can apply diff stimuli for each testcase
   #100 b=1'b1; #200 c=1'b0;
   $display("b=%d, c=%d",b,c);
   $finish; //this should be the last stmt, as the tool exits after this stmt
 end

//dump waveform in vcd format. To dump fsdb (novas proprietary format, but used by almost all vendors), we need other system task defined later.
 initial begin //to dump vcd files for all modules. Does not matter in which module it's placed, it still dumps for all modules.
   $dumpfile("tmp.vcd"); //name the dump file first, then select what to dump
   $dumpvars;
   $dumpoff;
   #3150us; //dump vcd starting from 3150us
   $dumpon;
   #600us; //end dump at 3750us
   $dumpoff;
 end

 initial begin //other way to dump
   #1000; //start of dump
   $dumpfile("/sim/ACE/.../tmp.vcd"); //name the dump file first, then select what to dump
   $dumpvars;
   #2000;
   $dumpflush; //flush dump data to the file at the end of the dump window
 end

endmodule


Running simulator: 2 ways.

  1. Multi-step: First compile (different compilers for diff src files), then elaborate then run simulator. Here all these steps are run separately. Not recommended.
    1. Compiler: We have different compilers for VHDL and Verilog. ncvhdl is VHDL compiler, while ncvlog is Verilog compiler.
      • ncvhdl cmd: ncvhdl vhdl_src_files => ncvhdl is VHDL compiler. run ncvhdl -help to get other options
        • ex: ncvhdl -V200X -messages -smartorder a.vhd b.vhd => enables V1993 and V2001 features (use -V93 to enable only VHDL 1993 features), print informative msg, and compile in order independent mode
      • ncvlog cmd: ncvlog verilog_src_files => analyzes and compiles verilog src. performs syntax check on HDL design and generates intermediate representation, in lib database file called inca.architecture.lib_version.pak (architecture=lnx86)
    2. Elaborator: Elaborates the design. ncelab is the elaborator provided by Cadence that elaborates the design compiled by compiler above.
      • ncelab cmd:  ncelab top_level_design_unit => elaborator takes lib cell:view of top level as i/p, and constructs design hier, establishes connectivity, and computes the initial values for all of the objects in the design. It generates machine code and a snapshot. By default the snapshot gives no rd, wrt or connectivity access to simulation objects. That means we won't be able to probe these objects from outside the HDL, which is OK in regression mode, but we need to enable rd access (-access +r) in debug mode.
    3. Simulator: Simulates the design using the test case or patterns provided.
      • ncsim cmd: ncsim snapshot_name => The simulator loads the snapshot generated by the elaborator, as well as other objects that the compiler and elaborator generate that are referenced by the snapshot. The simulator may also load HDL source files, script files, and other data files as needed.
        • ex: ncsim -run worklib.top:module => NOTE: Using -gui option with ncsim starts simVision. That brings up Design browser and Console. Then we can run ncsim cmds on the Console.
  2. Single step: Here all the steps from above are run as part of one cmd. This is much more convenient. There are 3 different variants here depending on the simulator that you have from Cadence. Either we use ncverilog or use irun/xrun (irun for IES and xrun for xcelium). NcVerilog is run in single step by using ncverilog on cmd line. irun/xrun is very similar to ncverilog, but in addition to verilog/system verilog, it can also accept vhdl, systemC, AMS, etc. Since irun/xrun run all steps, they have a lot more options, each of which is specific to the tool being invoked. So, we should refer to those tools (i.e ncelab, xmsim, etc) for the specific options that are supported. irun/xrun options are not case sensitive (i.e -nolog same as -NoLoG). Also, short versions of cmd line options are allowed (i.e -nowarn same as -now; the min num of char required for an option to be recognized in its short form varies by option).
    1. ncverilog: ncverilog does what multi step simulation does by invoking ncvlog, ncelab and ncsim for you. It lets us run NC-verilog simulator exactly the same way that we ran Verilog-XL (verilog-XL was run using cmd "verilog" on cmd line). All cmd line args are same as those of verilog-XL. On top of this, ncverilog also allows us to include ncvlog, ncelab and ncsim options on cmd line in form of + options. It also supports many more + options than verilog-XL.
    2. irun: It's for use with IES simulator. specifies all files on single cmd line. In ex below, top.v and sub.v are compiled by ncvlog using option -ieee1364, middle.vhdl is compiled by ncvhdl using option -v93, verify.e is recognized as specman e file and compiled using sn_compile.sh. After compiling all these, ncelab elaborates design using -access +r option (to provide rd access to simulation object, else in vcd/fsdb dump file, we won't see all wires,reg,etc) and generates sim snapshot. ncsim is then invoked with both SimVision (comprehensive debug env which includes design browser, waveform viewer, src code browser, signal flow browser,etc) and Specview gui.
      • ex: irun -ieee1364 -v93 +access+r +neg_tchk -gui verify.e top.v middle.vhd sub.v
      • ex: irun a.v b.v top.v tb.v => simplest cmd to run all rtl and tb files
    3. xrun: very similar to irun. It's for use with Xcelium simulator. However, compilers here are xmvlog, xmvhdl, sn_compile.sh. xmelab elaborates design, while xmsim simulates the design (xm means xcelium, while nc meant ncverilog which was used earlier in IES). xrun uses xmsc_run compiler i/f to compile c/c++ files. These compiled files, along with any other object files provided on cmd line, are then linked into single dynamic library, that is then automatically loaded before elaboration of design.
      • ex: xrun -ieee1364 -v93 +access+r +neg_tchk -gui verify.e top.v middle.vhd sub.v => NOTE: how all args are same as those of irun
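The two flows above can be compared side by side with a small dry-run sketch. It is written in POSIX sh (not the csh used by the run scripts later in these notes), the file names top.v and tb.v are hypothetical, and the snapshot name worklib.tb:module assumes a top-level Verilog module called tb compiled into the default worklib. By default it only prints the commands, so it can be read without the Cadence tools installed.

```shell
#!/bin/sh
# DRY_RUN=1 (default) prints each command; set DRY_RUN=0 to really run them
# on a machine where ncvlog/ncelab/ncsim/irun are installed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Multi-step flow: compile into worklib, elaborate with read access,
# then simulate the resulting snapshot.
multi_step() {
    run ncvlog top.v tb.v &&
    run ncelab -access +r worklib.tb:module &&
    run ncsim worklib.tb:module
}

# Single-step flow: irun drives the same three tools from one command line.
single_step() {
    run irun top.v tb.v +access+r
}

multi_step
single_step
```

For Xcelium the same shape would apply with xrun in place of irun, since the notes above say the arguments are the same.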

 

Sequence of steps when running the Simulator:


NOTE: Both irun and ncverilog finally run ncsim which runs simulation cmds. Using -gui option brings up SimVision on ncsim cmd prompt. When running irun/ncverilog, this is what appears on screen:


1. ncvlog/ncvhdl: analyzes and compiles each source file. => done only when any file changes, else it's skipped
ex:     file: ../models/CFILTER.v
        module worklib.CFILTER:v
                errors: 0, warnings: 0


2. ncelab: elaborates files and constructs design hier from top level design units. It auto figures out top level design units based on if they are referenced elsewhere. Usually digtop_tb and testcase_name_tc are top level design units as they aren't referenced anywhere else. Then it generates native compiled code for each module and then provides design hier summary.  It finally writes the simulation snapshot, which is a file that has all info for sim to run on it (w/o needing any info from anywhere else). elaboration step is run only when any file changes, else it's skipped
ex:   Elaborating the design hierarchy:
        Top level design units:
                digtop_tb
                S1_main_hunt_tc
        Building instance overlay tables: .................... Done
        Generating native compiled code:
                S1.AFE_AGC_S1:v <0x17bc2126>
                        streams:  28, words: 11022 < and so on for each module ....>
        Building instance specific data structures.   
        Loading native compiled code:     .................... Done
        Design hierarchy summary:   
                             Instances  Unique
                Modules:         1       1     
                Registers:       3       3  
                Initial blocks:  1       1
        Writing initial simulation snapshot: worklib.tb:sv   
Loading snapshot worklib.tb:sv .................... Done        
     
3. ncsim: loads the snapshot generated above and runs it. The ncsim prompt appears. It first sources the ncsimrc file (this file is needed by ncsim for displaying rc files). Then it issues the "run" cmd, and on encountering $finish in any module, or on reaching the end of all "initial" blocks with no "always" or other infinite loops running, it issues the "exit" cmd to exit ncsim.
ex: ncsim> source /apps/cds/incisiv/12.20.018p2/tools/inca/files/ncsimrc => this file aliases run as "." and exit as "quit", so that . will also work instead of run, and quit will also work instead of exit.
    ncsim> run .... (displays stmt which have $display ...)
    ncsim> exit

----------------

NOTE: In verilog-XL (ncverilog) and irun, many options of ncvlog, ncelab and ncsim which are preceded by "-" are replaced by "+".
ex: ncvlog -define arg1 => in ncverilog/irun, it's irun +define+arg1

Help:
>irun -helphelp
>irun -helpall

NOTE: to get help on any error that we see on running irun, we can type this:
Ex: error ncelab: *E,CUVRFA: blah ... shows up. To get more info type: nchelp ncelab CUVRFA
Ex: If error happened in ncvlog, type: nchelp ncvlog CUVRFA
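The lookup above can be scripted. The sketch below (POSIX sh) takes one message line from a log and prints the matching nchelp command. It assumes messages start with "tool: *X,MNEMONIC" as in the example above; the function name nchelp_cmd is made up for illustration.

```shell
#!/bin/sh
# Turn a message such as "ncelab: *E,CUVRFA: blah" into "nchelp ncelab CUVRFA".
# Prints nothing if the line does not look like a tool message.
nchelp_cmd() {
    printf '%s\n' "$1" |
        sed -n 's/^\([a-z]*\): \*[A-Z],\([A-Z0-9]*\).*/nchelp \1 \2/p'
}

nchelp_cmd 'ncelab: *E,CUVRFA: blah'
```

The printed command can then be pasted at the prompt, or the output of the function can be run directly once it has been checked.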

 


 

RTL and Gate Simulation setup:

Dir: /db/Hawkeye/design1p0/HDL/Testbenches/digtop/kagrawal/
3 subdir:
--------
tb: testbench dir. It has top level tb file (digtop_tb.v). digtop_tb.v defines a top level module digtop_tb, includes file all_tasks.v & xfilter.v, does initial begin .. end, and then instantiates module digtop and calls this dut, and connects all tb_* signals to appropriate digtop pins.

tc: testcase dir. It has test cases for different tests. i.e for interrupt block, it has interrupt_tc.v. Remember, any signal that you specify in tc should be an i/o port of a module or block, as internal net names may get renamed in gate synthesis, so even though the testcase may run on RTL, it'll fail to run on gate netlist.

sims: This is the main dir to run gatesims or RTL sims.

RTL:
-----
Build RTL dir:

run_rtl_sims (verilog) => script to run verilog RTL sims
----------------------
#we need to be able to run debussy to debug, so we provide a link to provided compiled lib from Debussy (if PLI app from Debussy has already been compiled into dynamic shared lib as is the case here) to provide bootstrap dynamic linking. Then user defined bootstrap fn can be accessed using load* (loadpli1, loadvpi, etc) in irun (or Nc simulator). This PLI defines functions such as $fsdbdumpvars and $fsdbdumpfile, which are needed for dumping fsdb files (note functions for vcd dump don't require this PLI, since they are supported by default by all simulators).
#for linux OS
set DEBUSSY_PLI     = "+loadpli1=/apps/novas/debussy/5.2_v21/share/PLI/nc_xl/LINUX/xl_shared/libpli.so:deb_PLIPtr"
#for SOLARIS OS
#set DEBUSSY_PLI     = "+loadpli1=/apps/novas/debussy/5.2_v20/share/PLI/nc_xl/SOL2/xl_shared/libpli.so:deb_PLIPtr"

irun -9.20.039-ius \ => specifying version of irun is optional. default is chosen based on ame if nothing specified. (running "irun -version" returns the version of irun being used)
$DEBUSSY_PLI \ => loads debussy PLI
-y /db/pdk/lbc8/rev1/diglib/pml30/r2.5.0/verilog/models \ => -y for dir. All gate verilog included incase we've any stdcells instantiated in RTL (usually clk gaters and mux/logic on clk/reset are hard instantiated)
#+incdir+../../tb/ \ => incdir option is used when we have `include "file1" in some other verilog file2. Then we have to include the whole dir where file1 resides, else while compiling file2, we'll get an error about file1 not found. We don't need to compile file1 as `include will cause file1 contents to be included in file2. Note that if we try to compile file1, it may not compile, as any verilog file to be compiled needs to have proper syntax (i.e file should have "module", "endmodule", etc). Many times in such include files we just have some verilog stmts, which is fine as these are just included in main file2 which already has module etc.
-y /db/Hawkeye/design1p0/HDL/Source/golden \ => instead of this, we could also use "-f rtl_files.f" which would have paths for each RTL file to be included
/db/Hawkeye/design1p0/HDL/Testbenches/digtop/kagrawal/tb/digtop_tb.v \
/db/Hawkeye/design1p0/HDL/Testbenches/digtop/kagrawal/tc/$argv[1]_tc.v \
-coverage ALL -covdut digtop -covoverwrite -covworkdir ./coverage/cov_$1 => puts coverage results in dir "./coverage/cov_$1/". Says top level dut used for coverage should be the "digtop" instance (we can also limit coverage to a particular sub-module by using the hier path for that instance, i.e the instance of the module, not the defn of the module). It generates binary coverage data files (UCD) and coverage model files (UCM). Coverage types can be code (block, expr, fsm, toggle) or functional (assertion, covergroup). "all" enables all coverage types listed (B=>Block, E=>Expression, F=>FSM, T=>Toggle, U=>fUnctional, A=>All. ex: we can write "-coverage BEFT" to enable all code coverage).
#NOTE: instead of using coverage cmds, we can also pass a .ccf cfg file which can have all cmds in there. i.e -covfile config.ccf. sample coverage.ccf file
select_coverage -all -module * => selects all coverage
set_libcell_scoring => IMP: sometimes we get no coverage results. Reason is coverage stops at libcells. Sometimes all modules treated as libcells whenever irun calls source dir with -y option (-y option is usually used with libcell dir). So, this "set_libcell_scoring" option forces coverage to be reported for all libcells too.

-l ./rtl_logs/$argv[1].log \ => -l (small letter L) is to specify logfile instead of default irun.log. We can also use /$1.log (as $1 and $argv[1] are same)
+nclibdirname+"$argv[1]_INCA_libs" \
+access+r \ => rd access so that all wires,reg etc can be accessed in vcd/fsdb files
+libext+.v \ => specifies extension of files referenced by -y option (+libext+extension). If this option is not used, then files referenced by -y should not have a file extension, else they will be ignored (very imp to use this with -y)
+licq \
#+sv \ => with -sv option, all verilog type files are compiled as SystemVerilog.
+notimingchecks \ => do not execute timing checks for $setup, $recrem, etc
-input dump.tcl \ => optional. needed for shm db dump. see in simvision section below for more details
+define+TI_functiononly \
+define+FSDBFILE=\\\"/sim/HAWKEYE_DS/kagrawal/digtop/rtl/$argv[1].fsdb\\\" \ => important to have \\\ before "
+define+FSDB \
+define+IMGFILE=\\\"/sim/.../a.img\\\" \ => this can be used in tb.v file or any other verilog file, to assign value from cmdline. i.e
 `ifdef IMGFILE defparam tb.block1.PRELOADFILE=`IMGFILE; `endif
-svseed random \ => assigns random seed to all $urandom fn
+nctimescale+1ns/1ps => default timescale to use if no timescale defined anywhere

#-work: by default, irun compiles all design units in HDL files in work library called worklib (located within INCA_libs dir). We can change work lib name by using -work.
#dir structure is:
INCA_libs/irun.nc/xllibs/models,golden => for models dir, golden dir, etc specified with -y above stored in xllibs
INCA_libs/worklib/.inca*db, inca*pak   => contains all compiled units as one file in .pak lib database. within worklib dir, we have subdir for std,ieee,worklib,synopsys,etc which have their own .pak database.

#-linedebug: to get debugging info
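The \\\" in the +define+FSDBFILE line above is easier to follow when each level of unescaping is printed. The sketch below is POSIX sh rather than csh, uses a hypothetical path, and models the tool's own re-parsing of the argument with eval. That second pass is an assumption made to explain why three backslashes are needed, not documented irun behaviour.

```shell
#!/bin/sh
# Hypothetical dump path, for illustration only.
fsdb=/tmp/example.fsdb

# Pass 1: the shell that launches irun reduces each \\\" to \"
arg=+define+FSDBFILE=\\\"$fsdb\\\"
printf 'after shell:       %s\n' "$arg"

# Pass 2 (assumed): one more round of parsing reduces \" to a plain ",
# so the macro value ends up as a quoted Verilog string.
eval "printf 'after second pass: %s\n' $arg"
```

With fewer backslashes the quotes would be consumed too early, and `FSDBFILE would expand to a bare path instead of a string literal.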

run_rtl_sims (mixed: tb is in verilog but src files are in vhdl/verilog)
----------------------
remains same as above (i.e same as running verilog rtl sims)
The only difference is that novas fsdb dump doesn't work on vhdl src files (i.e it only shows signals for verilog files in waveform, but not for vhdl files). Option is to dump vcd file, as vcd file will always have all signals. Other option is to set DEBUSSY_PLI to newer version of novas (in run_rtl_sims file) as follows: (doesn't seem to work ?)
DEBUSSY_PLI     = "+loadpli1=/apps/novas/debussy/2010.04/share/PLI/IUS/LINUX/boot/debpli:novas_pli_boot"

run vhdl rtl sims: /db/MOTGEMINI_DS/design1p0/HDL/Testbenches/digtop/sims/run_rtl_sims
-----------------
#for fsdb dump
In tb/tb_spi.vhd file, put "use WORK.novas.all;" at the top before entity declaration, and also add this directive:
process
begin
`ifdef FSDB
        fsdbDumpvars(0,":");
        fsdbDumpfile("test.fsdb");
`endif
end process;

#above code, always dumps fsdb file as dump.fsdb in current dir. So, we can instead run this to dump into specific file:
#create file nc.do and then call this file from irun cmd line by adding this option: -input nc.do \
call fsdbDumpfile /sim/HAWKEYE_DS/kagrawal/digtop/rtl/SPI.fsdb
call fsdbDumpvars 0 :
run => if we don't add this line, then ncsim stops at cmd prompt, and we have to type run on the prompt to continue

-----------
#run_rtl_sims (vhdl):
#LD_LIBRARY_PATH needs to be set
#solaris
#setenv LD_LIBRARY_PATH /apps/novas/debussy/5.2_v21/share/PLI/nc_vhdl/SOL2:$LD_LIBRARY_PATH
#linux
setenv LD_LIBRARY_PATH /apps/novas/debussy/5.2_v21/share/PLI/nc_vhdl/LINUX:$LD_LIBRARY_PATH

#note, here we specified debussy_pli with path separately defined above, while for verilog, it was all in one line.
set DEBUSSY_PLI     = "-loadpli1 debpli:novas_pli_boot"
#we may also add -loadcfc option above, to get rid of some system errors:
#set DEBUSSY_PLI     = "-loadpli1 debpli:novas_pli_boot -loadcfc debcfc:novas_cfc_boot"

#irun (same as for verilog, except -top,relax,V93 options used)
irun \
$DEBUSSY_PLI \
-y /db/pdk/lbc7/rev1/diglib/msl270/r3.0.0/verilog/models \
/apps/novas/debussy/5.2_v21/share/PLI/nc_vhdl/LINUX/novas.vhd \
#/apps/novas/debussy/2011.01/share/PLI/IUS/LINUX/boot/novas.vhd \
/db/MOTGEMINI_DS/design1p0/HDL/Source/spi_typedefs.vhd \
/db/MOTGEMINI_DS/design1p0/HDL/Source/spi_control.vhd \
/db/MOTGEMINI_DS/design1p0/HDL/Source/spi_regs.vhd \
/db/MOTGEMINI_DS/design1p0/HDL/Source/spi.vhd \
/db/MOTGEMINI_DS/design1p0/HDL/Testbenches/digtop/tb/tb_spi.vhd \ =>
-top E \ => for vhdl, top entity has to be declared (this top entity is in tb/tb_spi.vhd)
-relax \ => to relax strict vhdl requirements
-V93 \ => since our vhdl is 1993 format
-input nc.do \ => use this, if we call fsdb cmd in nc.do instead of fsdb cmd in tb_spi.vhd
-l ./rtl_logs/$argv[1].log  ... => other options same as those for verilog

-------
#for vhdl and SystemC files, you have to specify the top level with -top option, as simulator does not automatically calculate top-level VHDL/SystemC design units. However, with this option, autodetection of top level verilog modules is disabled. (-vhdltop and -sctop specifies VHDL top level and Sc top level, but doesn't disable auto calculation of verilog top level units)
-top [lib].cell[:view] => specifies top level unit, can use multiple -top options to specify multiple top-level units
Ex: -top E \ => entity E is defined in top level testbench file tb_spi.vhd, which calls top level source entity spi.

#for vhdl, IEEE 1076 standard does not allow multiple choices (i.e. 0=>'1', OTHERS=>'0') in an array aggregate that is not locally static (i.e. VECTOR(size-1 downto 0) has a variable range). If you make the range of the array static (e.g. VECTOR(3 downto 0)) or provide only one choice (e.g. OTHERS=>'0'), then the code will compile correctly. Cadence has added a switch named '-relax' to ncvhdl which relaxes a variety of LRM rules, and allows such code to compile.
-relax \
#we can also use option -V93 to force irun to compile with VHDL93 syntax.

GATE:
----
gate sims run on gate level netlist, which has all nets as "wire". If there's a net which is an i/o port of a module, it has to be connected through a "wire" at higher level to another i/o port of some module, or to an i/o port of the top level module. All these "wire" have parasitics associated with them in the spef file, and hence delays associated with them in the sdf file. Some nets appear as "wire", but during optimization, they are not used for connections (like instead of Q pin of flop, QZ pin is used sometimes, which results in the net associated with Q pin being left floating). Such nets, even though listed as "wire", don't have any parasitics and are reported as "unannotated nets" during sdf file generation (in PT).

We do timing checks when running gate sims. This may cause non-convergence in simulator for cases where there are -ve setup/hold times or -ve rec/rem values in sdf file. see in verilog.txt.

--------------------------------
GateSim (for verilog testbench):
--------------------------------
For gatesims, we do xfiltering for meta flops, and we do sdf annotation for all nets/cells. We add this in digtop_tb.v in b/w "module ... endmodule", whenever SDF_MAX or SDF_MIN is defined.
digtop_tb.v:
1A. xfilter: include "../tb/xfilter.v" => In this file, we define Xon parameter for all meta flops to be 0. On doing this, setup/hold check is turned off for this flop, so that we don't see these warnings: "Warning!  Timing violation $setuphold<setup> ( posedge CLKIN:65071 PS, posedge EN:65077 PS,  0.248 : 248 PS,  0.041 : 41 PS );... Time: 6548 PS" for that flop. Here, numbers shown are setup of 248ps(min:typ:max), and hold of 41ps(min:typ:max). When only 2 values shown instead of triplet, that means sdf file had only 2 values. Here CLK and EN comes within 6ps (65077-65071) causing a viol.
ex: defparam testbench.Idigtop.Ideglitch.mota_itrip_deg.sig_meta_reg.Xon = 0; => This Xon parameter = 1 in model of flop (in ifdef TI_verilog section of DTCD2.v flop). So, by default X is propagated, but if we set Xon=0, then X is not propagated. X value in that meta flop is forced to whatever RTL is modeling. That means whatever is at the i/p of the flop right before the clk edge is passed. If the i/p changes right on the clk edge, then the coding sequence determines which happens first, i/p change or clk edge. If we don't set Xon=0, then X's will get propagated to all logic eventually, and all our test cases will fail. By setting Xon=0, we force o/p of flop to be 0 or 1 always.
 => Next, in filtered_logs dir, we copy all log files from gate_logs dir, and search for any "Warning" msg using filter_warnings.pl script. We should not see any warnings as meta flops are the only ones that should have setup/hold viol. Any other viol is real, and should be fixed in design. Since we were timing clean, we should investigate if we had mistakenly set that path to a false path in PT/ETS.

1B. instead of xfilter.v file, we can also turn off timing check by using tcheck cmd by specifying it on irun cmd. (valid for irun versions 14.2 or later)
    +nctfile+gate.tfile => arg to irun (no space in b/w "+")
   ex: In gate.tfile, we put 1st sync flop for all synchronizers to be filtered out for x propagation. This also prevents tool from generating "Warning! Timing violation $setuphold ...". option 1A above may still generate warnings depending on library model written.
       PATH tb_digtop.dut.sync_*.genblk1_S_sync1 -tcheck => turns off timing check for flop genblk1_S_sync1. Not sure, if it turns off all timing checks or just setup/hold.
   NOTE: if running an older version of irun, then the tool doesn't pick up these tcheck and will throw this warning "ncelab: *W,TFANOTU (gate.tfile) tfile node ... was not used by design". This means the tool discarded the tcheck, due to old version, etc.

1C. we can also provide timing check file via "-input tcheck_off.tcl", which will have " tcheck -off" cmd for 1st stage of all sync flops.
ex: tcheck -off veridian_tb...i_sync_flops.u_sync.tiboxv_sync_2s_acn_sync_0

2. sdf_annotation: $sdf_annotate( .... ) for both max/min. see in sdf annotator section below.
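The log check described under 1A can be approximated with grep when filter_warnings.pl is not at hand. This is only a sketch in POSIX sh: it counts lines containing the "Timing violation" text quoted above, the function names are made up, and it makes no claim about what the real filter_warnings.pl script does.

```shell
#!/bin/sh
# Count timing-violation lines in one log file (prints 0 if there are none).
count_timing_warnings() {
    grep -c 'Timing violation' "$1" || true
}

# Print "<count> <file>" for every *.log file in the given directory.
report_logs() {
    for f in "$1"/*.log; do
        [ -e "$f" ] || continue
        echo "$(count_timing_warnings "$f") $f"
    done
}

# Usage (directory name taken from the notes above): report_logs ./gate_logs
```

Any non-zero count for a flop that is not a meta flop would then need the investigation described in 1A.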

Dir: /db/Hawkeye/design1p0/HDL/Testbenches/digtop/kagrawal/gatesims
run_gate_sims_max => script to run gatesims for max delay
----------------------
#same as run_rtl_sims except netlist is gate level, neg_tchk, max_delays, define+TI_verilog used

set DEBUSSY_PLI     = "+loadpli1=/apps/novas/debussy/5.2_v21/share/PLI/nc_xl/LINUX/xl_shared/libpli.so:deb_PLIPtr"
irun \
$DEBUSSY_PLI \
-y /db/pdk/lbc8/rev1/diglib/pml30/r2.5.0/verilog/models \
../../../Source/global.v \
../../../FinalFiles/digtop/digtop_final_route.v \ => gate netlist
../tb/digtop_tb.v \
../tc/$argv[1]_tc.v \
-l ./gate_logs/$argv[1]_max.log \
+nclibdirname+"$argv[1]_INCA_libs" \
+access+r \
+libext+.v \
+licq \
+sv \
+neg_tchk \  => allows neg values in $setuphold and $recovery timing checks in the Verilog description and in SETUPHOLD and RECREM timing checks in SDF annotation. This is needed because by default tools zero out -ve timing check numbers, as sims with -ve values may not converge and can cause large performance issues. see in verilog.txt for more info on -ve timing checks.
+max_delays \ => Apply the maximum delay value if a timing triplet in the form min:typ:max is provided in the Verilog description or in the SDF annotation.
-input dump_gate.tcl \ => optional. same format as for rtl sims.
-SDF_CMD_FILE sdf_max.cmd \ => optional. see sdf section below for details.
+nctfile+gate.tfile \ => optional. turns off timing checks for specified gates. see above section for details.
+define+TI_verilog \ => TI_verilog uses models with delays
+define+FSDB \
+define+FSDBFILE=\\\"/sim/NOZOMI_NEXT_OA/kagrawal/digtop/gate/$argv[1]_max.fsdb\\\" \
#+define+VCD \
+define+VCDFILE=\\\"/sim/NOZOMI_NEXT_OA/kagrawal/digtop/gate/$argv[1]_max.vcd\\\" \
+define+SDF_MAX \ => SDF_MAX annotation used in top level module
+nowarnCUVWSP \
+nctimescale+1ns/1ps

run_gate_sims_min => script to run gatesims for min delay.
----------------------
same as for max, except +min_delays, +define+SDF_MIN used.
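Since the max and min scripts differ only in these two options, one script with a corner argument could serve both. The sketch below is POSIX sh (the real scripts are csh) and the function names are hypothetical. The "_min.log" name is an assumption that mirrors the "_max.log" name used above.

```shell
#!/bin/sh
# Print the corner-specific irun options; fail for an unknown corner.
corner_opts() {
    case "$1" in
        max) echo "+max_delays +define+SDF_MAX" ;;
        min) echo "+min_delays +define+SDF_MIN" ;;
        *)   echo "unknown corner: $1" >&2; return 1 ;;
    esac
}

# Log file name for a given testcase and corner, e.g. ./gate_logs/spi_max.log
corner_log() {
    echo "./gate_logs/$1_$2.log"
}

corner_opts max
corner_log interrupt min
```

The output of corner_opts would be appended to the common irun option list in place of the hard-coded +max_delays and +define+SDF_MAX lines.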

NOTE: after running gatesims with sdf_annotate, look in sdf_max.log to make sure it has no errors or warnings. Else, sdf is not correctly annotated.

----------------------------
GateSim (for vhdl testbench):
----------------------------
Dir: /db/MOTGEMINI_DS/design1p0/HDL/Testbenches/digtop/kagrawal/gatesims
run_gate_sims_max => script to run gatesims for max delay

#sdf compiled file generation (see below in sdf annotation)
ncsdfc /db/MOTGEMINI_DS/design1p0/HDL/FinalFiles/digtop/digtop_max.pt.sdf -output ./digtop_max.pt.sdf.X

setenv LD_LIBRARY_PATH /apps/novas/debussy/5.2_v21/share/PLI/nc_vhdl/LINUX:$LD_LIBRARY_PATH
set DEBUSSY_PLI     = "-loadpli1 debpli:novas_pli_boot -loadcfc debcfc:novas_cfc_boot"

irun \
$DEBUSSY_PLI \
-y /db/pdk/lbc8/rev1/diglib/pml30/r2.5.0/verilog/models \
/apps/novas/debussy/5.2_v21/share/PLI/nc_vhdl/LINUX/novas.vhd \
/db/Hawkeye/design1p0/HDL/Source/golden/global.v \
/db/Hawkeye/design1p0/HDL/FinalFiles/digtop/digtop_final_route.v \ => gate netlist
/db/Hawkeye/design1p0/HDL/Testbenches/digtop/kagrawal/tb/digtop_tb.v \
/db/Hawkeye/design1p0/HDL/Testbenches/digtop/kagrawal/tc/$argv[1]_tc.v \
-l ./gate_logs/$argv[1]_max.log \
-input nc_max.do \ => look above in rtl sim for vhdl (it calls fsdb dump functions)
+nclibdirname+"$argv[1]_INCA_libs" \
+access+r \
+libext+.v \
+licq \
#+sv \
+neg_tchk \ => allows neg values in $setuphold and $recovery timing checks in the Verilog description and in SETUPHOLD and RECREM timing checks in SDF annotation.
+max_delays \ => Apply the maximum delay value if a timing triplet in the form min:typ:max is provided in the Verilog description or in the SDF annotation.
+define+TI_verilog \ => TI_verilog uses models with delays
+define+FSDB \
+define+FSDBFILE=\\\"/sim/HAWKEYE_DS/kagrawal/digtop/gate/$argv[1]_max.fsdb\\\" \
+define+GATE \
+define+SDF_MAX \ => SDF_MAX annotation used in top level module
+nowarnCUVWSP \
+nctimescale+1ns/1ps

run_gate_sims_min => script to run gatesims for min delay.
----------------------
Same as for max, except that +min_delays and +define+SDF_MIN are used.
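
Since the max and min scripts differ only in these two options, the corner-specific part can be generated from one place. Below is a minimal sketch in Python; the helper name corner_options is hypothetical (it is not part of the scripts above), and it only covers the two options named in these notes.

```python
def corner_options(corner):
    """Return the irun options that differ between the max and min gate-sim scripts.

    Hypothetical helper: only +<corner>_delays and +define+SDF_<CORNER>
    are taken from the notes; everything else in the script stays shared.
    """
    if corner not in ("max", "min"):
        raise ValueError("corner must be 'max' or 'min'")
    return [f"+{corner}_delays", f"+define+SDF_{corner.upper()}"]


if __name__ == "__main__":
    for corner in ("max", "min"):
        print(corner, " ".join(corner_options(corner)))
```

Other corner-dependent items in the script (log file name, fsdb file name) would need the same treatment if the scripts were merged.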

 




Waveform viewer and debugging system:

Many waveform viewers are available to view the results of simulation. Some popular ones are listed below:

  1. SimVision from Cadence: comprehensive debug env which includes design browser, waveform viewer, src code browser, signal flow browser, etc. It uses *.shm waveform database to store waveforms. Expensive license ($50K).
  2. Debussy from Novas (Novas was purchased by SpringSoft in 2008): The Knowledge-Based Debugging System. Debussy is cheaper ($5K), but its superset Verdi, which is a behaviour based debugger, is what is generally used. It uses fsdb and vcd waveform databases. All the cmds of Debussy are valid in Verdi. Debussy is invoked by typing: debussy -f <cmd_file>. We use debussy, Release 2008.10, Linux x86_64/64bit (though it reports that it's using verdi 2008.10 version with 64 bits).
  3. Verdi from Novas (Verdi was a product of Novas, and is now owned by Synopsys, which purchased SpringSoft in 2012): Verdi is a superset of Debussy; it costs more but has a lot more features. Invoked by typing: verdi -f <cmd_file>. Verdi is the recommended tool to use (instead of debussy).

All these waveform viewers need the waveform in some format to display it. The two most commonly supported waveform formats are listed below.

  1. VCD: (value change dump), ASCII  format for waveform dumpfiles. defined by IEEE std 1364-2001 and supports 6 value VCD format (orig 4 valued logic: 0,1,Z,X and later signal strength and direction added). widely used. The VCD file comprises a header section with date, simulator, and timescale information; a variable definition section; and a value change section, in that order.
  2. FSDB: (fast signal database), which is Novas' proprietary waveform dump format.  It is much more compressed than the standard VCD format generated by most simulators.  Novas provides a set of object files (using +loadpli) that link with all common commercial simulators to generate an FSDB file directly.
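
To make the three VCD sections (header, variable definitions, value changes) concrete, here is a minimal sketch of a reader in Python. It is an illustration only: the function name parse_vcd and the sample text are made up for these notes, and it handles just the plain 4-state keywords shown in the sample, not extended VCD.

```python
def parse_vcd(text):
    """Split a small VCD string into header info, variables and value changes."""
    header, variables, changes = {}, {}, []
    scope, now = [], 0
    tokens = iter(text.split())
    for tok in tokens:
        if tok in ("$date", "$version", "$timescale", "$comment"):
            body = []
            for t in tokens:              # header section: collect text up to $end
                if t == "$end":
                    break
                body.append(t)
            header[tok[1:]] = " ".join(body)
        elif tok == "$scope":             # variable definition section
            _kind, name, _end = next(tokens), next(tokens), next(tokens)
            scope.append(name)
        elif tok == "$upscope":
            next(tokens)                  # consume $end
            scope.pop()
        elif tok == "$var":
            _type, _width, ident, name = [next(tokens) for _ in range(4)]
            for t in tokens:              # skip optional bit range up to $end
                if t == "$end":
                    break
            variables[ident] = ".".join(scope + [name])
        elif tok == "$enddefinitions":
            next(tokens)                  # consume $end
        elif tok.startswith("$"):
            pass                          # $dumpvars / $end inside value change section
        elif tok.startswith("#"):
            now = int(tok[1:])            # timestamp
        elif tok[0] in "bBrR":            # vector or real: value, then identifier code
            changes.append((now, variables[next(tokens)], tok[1:]))
        else:                             # scalar: value char followed by identifier code
            changes.append((now, variables[tok[1:]], tok[0]))
    return header, variables, changes


SAMPLE = """
$date today $end
$version demo $end
$timescale 1ns $end
$scope module tb $end
$var wire 1 ! clk $end
$var wire 4 % cnt [3:0] $end
$upscope $end
$enddefinitions $end
#0
0!
b0000 %
#5
1!
b0001 %
"""

if __name__ == "__main__":
    hdr, var, chg = parse_vcd(SAMPLE)
    print(hdr)
    print(var)
    print(chg)
```

The short identifier codes (! and % here) are what keep the value change section compact: each signal is named once in the definition section and referred to by its code afterwards.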



SimVision:
--------
Using -gui option with ncsim or irun/ncverilog brings up SimVision.
> irun -gui -f run.f -access RWC -linedebug (add "-uvmlinedebug" if running with uvm)
NOTE: "-access +r" or "-access RWC" is needed, else waveform dump won't show any signals (as they don't have read permission; r=read, w=write, c=connectivity to help with x propagation). Also, ncsim cmds for dumping waveform into cadence database (waves.shm) are needed in input script or on ncsim prompt. See below for details.

We can also directly type simvision to bring up simvision. We can then open "waves.shm" database.
simvision &
simvision -waves waves.shm -input digtop.svcf & => This will open up waves.shm database, with signal file digtop.svcf (similar to rc file in nWave). We can do "File->Source command script" to load svcf file or "save command script" to save svcf file.
 
SimVision has a Design browser and Console.
1. Design Browser-simvision: It allows us to browse the design. It shows modules, RTL, etc. NOTE: If we select signals on this, they won't show up in waveform window automatically. We have to do "send to waveform" to see them on waveform viewer.

2A. waveform-simvision: To invoke waveform viewer, click on "send to waveform" on design browser of simvision (It's 2nd button after + sign on top right side). Imp cmds:
send to: => used to send values from waveform to RTL or schematic and vice versa
= => this zooms to fit waveform

2B. On waveform, to see delta time delay: take mouse to "yellow pulse shape", hold right click for a second, and a pop up comes. choose "expand time"->All_time. Then on waveform we see blue shaded area. The blue area shows what happens in delta delay time (you will see that time remains same in blue area, but numbers in brackets change implying delta delay)

3. Console-simvision: It's used to run ncsim cmds. It has ncsim prompt on simulator tab (It has 2 tabs on bottom: SimVision and simulator). When we type "run" cmd on it, that is when it starts running sims. When we are not in SimVision gui mode, run is automatically placed on ncsim cmd prompt, so that our simulation runs to completion. When completed, exit is automatically placed on ncsim cmd prompt to exit sim. If we want to stop sim when in cmd line mode, we can add "-tcl" to cmd line, and then tool will stop at ncsim prompt. We'll have to type "run" on ncsim prompt to continue. ex: irun -tcl -f run.f (stops at ncsim prompt)
Ex of ncsim cmd:
ncsim > database -event -open waves -into waves.shm => create shm database named waves.shm (which contains .dsn and .trn files, which are waveform dump). waves is the scope. "-event" records zero time events on any signal, which are otherwise not possible to see. This helps detect edges happening with 0 width.
ncsim > probe -create -all -depth all -tasks -functions -memories -database waves -name probe_a => probe all signals, at all depths, and for all tasks and functions too. By default it does not probe memories (2-d, 3-d arrays), so we have to put -memories also (however, if we run gui mode w/o using -tcl, then memories are automatically added to probe). Put this probe data into database waves. If no name is provided for probe, then ncsim will name it probe 1, probe 2, etc. NOTE: in design browser, select Scope as "waves", and then you will see all signals with values. By default, scope is "all available data" which shows simulator scope also (which may not have any probe data).

NOTE:To get extended vcd (which shows port dirn too), do this: (evcd needed to generate tdl files)
ncsim> database -open waves -evcd -into myvcd.vcde
ncsim> probe -create testbench.dut -evcd -database waves
Instead of above 2 cmds, we can also do this in Tb.sv file: initial $dumpports(UVMTb.I_dut, "sim.vcde");

ncsim > run => runs ncsim till it terminates. pgm terminates when $finish is reached in any module.
ncsim > run 2.5 ms => runs ncsim for 2.5ms
ncsim > exit => exits ncsim.
ncsim> reset => resets ncsim, so that we can run simulation again starting from time 0
NOTE: To rerun new rtl after modification, we can either close simvision and rerun simulation again or from Console window we can click Simulation->Reinvoke Simulator. This reruns new rtl and loads new waveform.

NOTE: we can provide -input option with irun, specifying the input file, which gets loaded on ncsim prompt. This saves us from manually typing the ncsim cmds on cmd line. If we don't provide cmds for "database -open .." or "probe -create ...", then no cadence database is created. To create vcd/fsdb database, we have to provide system task "$dumpvars .." within "initial begin ... end" block to dump waveform database.
Ex: irun -access +r -f rtl_files.f -input dump.tcl .... => -access +r is needed to see signals in waveform dump
dump.tcl has these lines:
database -open waves -into /sim/bellatrix/kagrawal/waves.shm -default
probe -create -emptyok -database waves -all -memories -depth 10 digtop_tb => var in function/task not dumped by default. To dump those, use -variables.
probe -create -emptyok -database waves -all           -depth 3  Silver_top.Xosc.I1 => This type of probe used for ams sims to dump voltages upto 3 levels deep
probe -create -emptyok -database waves             -flow -ports Silver_top.Xosc.AVDD => This probes current at AVDD port of Xosc block. Valid for ams sims only, since digital blocks (which are modeled as verilog) do not consume any current.
probe -create -emptyok -database waves -all -flow     -depth 3  Silver_top.Xosc.I1 => This probes current for all nets upto 3 levels deep.
probe -create -emptyok -database waves -all -memories -depth 10 -domain digital => This is helpful in ams sims, where we do not need to specify path of digital block. It does probing upto 10 level deep of all nodes which are digital in nature (i.e have verilog models)
run
quit => this is executed after run has finished

----------

Xcelium (xrun):

--------

As discussed earlier, xrun is used to run designs on Xcelium Simulator. It works similarly to irun, and the options for xrun are the same as those for irun. Important help cmds for xrun:

> xrun -helpshowsubject => shows list of subjects as xmvlog, xmvhdl, xmelab, xmsim, etc

> xrun -helpsubject xmvlog => shows all options for subject xmvlog, as -assert, -ams, etc

> xrun -helpall -helpalias => -helpall displays list of every supported option, while -helpalias displays different ways to enter an option (ones entered using -/+ signs. irun/xrun use both "-" and "+" for cmd line options)

ex: xrun top.v test.c obj1.so -y ./libs -y ./models -l run1.log ... (source files can be in any format as .v, sv, .vhd, .e, .vams, .c, .cpp, .s, .o, .so, etc)

This is what the dir looks like after you run xrun:

xcelium.d => this build dir is created instead of INCA_libs. Contents in this dir are automatically checked (timestamp, snapshot info, etc) on rerun of xrun, to determine if recompilation or re-elaboration is needed. It has following subdir:

1. xcelium.d/run.<platform>.<xrun_version>.d (ex: xcelium.d/test_sim.lnx8664.19.01.d; here, instead of run, we used the custom name test_sim by using option -snapshot test_sim). A soft link named test_sim.d, pointing to this dir, is created by default. Within this dir is the xllibs subdir, which has a subdir for each -y library and each -v library file (i.e run.d/xllibs/<libs> and run.d/xllibs/<models> when cmd is "xrun top.v -y ./libs -y ./models ... ")

2. worklib => design units contained in HDL design files (as in top.v) are compiled into this dir. Using option "-work <worklib_name>" changes the name of this worklib dir. Within this dir is the library database file called "xlm.lnx8664.066.pak", which stores all intermediate objects required by Xcelium core tools. These .pak files are large and so are usually compressed by using -zlib option

3. history => There is history file which records all prev cmds run

options:

-64/-64bit => runs 64bit version of xrun

-top chipTb => defines top level module (can have multiple such cmds since there are typically multiple top level modules from uvm, design, etc). This option not needed for v/sv top level modules, but required for vhdl/systemC top level modules. By default, top level design units are automatically determined for v/sv, but are not automatically inferred for vhdl/systemC if top units are in these files. In such cases, this option is required

-l <logfile> => by default, log is written to xrun.log in same dir where xrun was invoked

-v libfile.v => old scheme of lib mgmt. xrun scans this file for module/udp defn that can't be resolved in normal src files specified. -v option causes module/udp in these files to be parsed, only if they have the same name as unresolved module/udp. Otherwise they are not parsed, which saves time. If we omit -v, then these module/udp in these files will always be parsed

-y <lib_dir> => specifies path to library dir, where files containing defn of module/udp are to be found

-define foo=2 => -define similar to using `define compiler directive in verilog. same as irun, can use +define+ also. If there's no value to assign, we can also do "-define foo".

-compile => parse and compile source files, but do not elaborate

-elaborate => parse and compile source files, elaborate design and generate simulation snapshot but do not simulate. If -compile/-elaborate options are not used, then all steps run (compile/elaborate/simulate)

-hal => this runs HAL (HDL analysis) on snapshot instead of running simulator. This is used to check for any errors/warnings etc in design files.

-snapshot <snapshot_name> => generate sim snapshot with given name (-name or -snapshot are both same) in xcelium.d/worklib/<snapshot_name>/*. By default, snapshot is in xcelium.d/worklib/run/*. This option also changes name of xcelium.d/run.lnx8664.19.01.d to xcelium.d/<snapshot_name>.lnx8664.19.01.d.

-r <snapshot_name> => load and simulate specified snapshot, w/o doing any kind of checking. By providing "-input file1.tcl", we can provide diff tcl cmd i/p files to have multiple diff sims with same snapshot. -R (w/o any snapshot name) is used to simulate the last snapshot generated by xrun cmd.

-xmlibdirname <xcelium_dirname> => use a custom dir name instead of xcelium.d. When running simulator only (using -r or -R option), we need to provide this, if snapshot is not in default dir path or default name.

-clean => this forces removal of dir xmlibdirname or xcelium.d to start fresh. This causes xrun to recompile, re-elaborate and recreate dir. In absence of this option, automatic checks are done to determine if this dir can be reused

-hdlvar /home/.../my_hdl.var => This var file is a configuration file that can have all cmd line options and args in 1 place (i.e DEFINE XRUNOPTS -ieee1364  -access +rw etc) . That way, the regular xrun cmd won't look lengthy and complex

-f <args_file> => We can also provide additional argument file that can have any args in it, name of source file, and everything else needed with xrun, which will be added to xrun existing args (i.e -clean source.v ...)

uvm cmd line options supported by xrun:

-uvm => enable support for uvm

-uvmhome /UVM/.../uvm-1.2 => specifies loc of uvm installation. By default, uvm is installed in <install_dir>/tools/methodology

-uvmexthome .../CDNS-1.2 => loc of cadence extensions to uvm. By default, uvm extensions are installed in <uvmhome>/additions/sv

+UVM_TESTNAME=<test_name> => specify name of test. run_test() task in top level module calls this test to run.

 

 


Debussy:
---------
used to see waveform dump, and annotate it to rtl/gate so that debug is easier. It is also used to see schematic rep of rtl or gate, which helps to see connectivity. Gate schematic especially helps during ECO as we don't have to manually go thru verilog text file of digtop_final_route.v.
Debussy has following tools as part of the suite.

nTrace:
-------
gui that comes up to traverse design hier. Can trace load, driver, connectivity. Can change src code by choosing your editor: tools->preferences->editor, and then choosing source->edit source file.
to import design, goto file->import design. Select "from file", set Virtual Top as "digtop", default dir as "/db/Hawkeye/.../FinalFiles/digtop", then in bottom LHS panel, goto dir "/db/Hawkeye/.../FinalFiles/digtop", then click on synthesized netlist "digtop_final_route.v" in RHS, and click Add. Then it shows up in design Files. Click OK. Now, you can see whole netlist in the top panel
active annotation: allows us to view verification results in context of src code. But before using this, we need to load sim results (in FSDB file) using file->load simulation results. Then in hier browser, double click the instance that you want, choose source->goto->line, enter line number and OK. Then choose source->active annotation (or x key after putting the cursor in source code pane) to activate active annotation. Values associated with each signal are then displayed at time 0. Now we can search forward or backward for signal changes to move in time.

nSchema
----------
gui that shows schematic.
Once you have imported design, goto tools->new schematic->current scope. Then schematic is drawn for whatever is selected as current scope in panel (current scope name also shows in the top window bar, it's set as whatever instance is selected, i.e digtop or interrupt etc).
In new schematic window, goto view->high contrast. This turns ON contrast for better viewing.
 
nWave
------
gui that shows waveform viewer:
nWave -ssf test1.fsdb => This loads the fsdb file directly
Load fsdb file: do file->open. Then type name of dir containing fsdb file in white box. That shows the dir and files in that dir in two windows below. Select appropriate fsdb file in RHS window. Click on Add, and then OK. This loads the fsdb file.
get signals: click on "get signals" (next to open file drawing)
important settings:
1. Waveform->Snap cursor to transitions. when this is set, then when we click on any signal waveform, then the cursor goes to the next edge. Useful when doing active annotation in debussy, since the change shows up in rtl signal values.
2. Tools->Preferences. It has almost all settings for GUI. These settings remain there even on quitting nWave. Goto View Options->Waveform Pane. Check box "Highlight selected signals". This highlights selected signals.
3. To search for signal name, enter it with the right hier and right case in "Find Signal". To search all hier, enter * at the end in "Scope", then it searches for everything under that hier. For ex: if you are in digtop_tb hier, you will see "digtop_tb" in Scope. Just enter * after that, i.e: /digtop_tb/*
4. To set an alias file for a state machine, etc, first select the signal on the waveform viewer that you want the alias to be applied to. Then select alias file as: waveform->Signal_value_radix->Add_alias_from_file, then choose the alias file and hit OK. Alias file syntax is shown below for an example file, states_timergen.alias:
ALIAS timergen_sm
 PT_RESET          4'b0000
 PT_XG_INC         4'b0001
ENDALIAS
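
For state machines with many states, typing the alias file by hand gets tedious. Below is a minimal Python sketch that writes the same ALIAS/ENDALIAS layout from a dict of state names. The function name make_alias is hypothetical, and the column padding simply copies the example above (it is assumed here that the amount of whitespace is not significant to nWave).

```python
def make_alias(name, states, width):
    """Build alias-file text in the ALIAS ... ENDALIAS layout shown above.

    name   : alias name (e.g. timergen_sm)
    states : dict mapping state label -> integer encoding
    width  : bit width of the state register
    """
    lines = [f"ALIAS {name}"]
    for label, value in states.items():
        # e.g. " PT_RESET          4'b0000"
        lines.append(f" {label:<17} {width}'b{value:0{width}b}")
    lines.append("ENDALIAS")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(make_alias("timergen_sm", {"PT_RESET": 0, "PT_XG_INC": 1}, 4), end="")
```

The state names and encodings would normally be copied from the parameter/localparam list in the RTL, so that the waveform labels stay in sync with the design.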

Verdi: superset of Debussy, as a lot more tools available.
------
    nCompare - Waveform compare (compare rtl and gate level waveforms).
    nSchema - Schematic browser(delay annotation).
    nState - State Diagram Debugger (Displays the Bubble Diagram of state machines)
    nAnalyzer - Debug clock tree, clock and reset analysis, view multiple clock domains.
    nEco - Evaluate the changes made on the fly and validate them.
    SVTB - Gives the System Verilog Test Bench Inheritance view, class variables can be viewed synchronously with other signals on nWave.
    Assertion Evaluator - Evaluates System Verilog assertions off line without the simulator.
    Power Manager - Debug the UPF and CPF files and visualize the different power domains in the design
    Temporal Flow View - Brings time, value and hierarchy on the same window


Running Debussy:
--------------------
Dir: /db/Hawkeye/design1p0/HDL/Debussy/

#Before we can run debussy, we need to generate fsdb file and do sdf_annotation (for gate sim) in irun. fsdb generation is not necessary, since debussy can convert vcd into fsdb on the fly. sdf annotation is also not a necessity since we can always run gatesims w/o sdf annotation, but then it's not very useful.

#generate fsdb: add following lines in top level verilog code. (+loadpli option should be used on irun cmd)
#File: /db/Hawkeye/design1p0/HDL/Testbenches/digtop/kagrawal/tb/digtop_tb.v

   initial
     begin
`ifdef FSDB => note FSDB was defined in cmd line of irun, so this section is valid. It generates fsdb which is proprietary.
        $fsdbDumpfile(`FSDBFILE); // name the fsdb file first, so that the name takes effect before dumping is set up
        $fsdbDumpvars;

        //below cmds needed only if do not want dumping for all of sim time. Similar to vcd system task
        $fsdbDumpoff; // stop dumping at time 0, else everything from time 0 onwards is dumped too

        #5000;

        $fsdbDumpon; // This starts dumping

        #1000; //Dumps for 1000 time units starting from 5000 time units after sim starts

        $fsdbDumpoff; //this stops dumping
`endif
-----
#NOTE: In $fsdbDumpvars, we can also provide 2 arguments. 1st arg is name of block from which you want to dump fsdb, and 2nd arg indicates whether we just want to dump for this block (1) or for all the hierarchy below it (0).
ex: the code below dumps fsdb for digtop_tb (only top level since 2nd arg is 1), then dumps fsdb for digtop_00 which is a block within digtop_tb (all levels below it since 2nd arg is 0). The combined fsdb dump is in fsdbfile. So, in nWave, we'll see only digtop_tb. digtop_tb will contain digtop_00 module. digtop_00 module will contain all modules below it.
 $fsdbDumpfile(`FSDBFILE);
 $fsdbDumpvars(digtop_tb, 1);
 $fsdbDumpvars(digtop_00, 0);


-----

`ifdef VCD => if we need VCD (value change dump) which is std waveform database. Can be used with Novas Debussy as it supports both VCD and FSDB. See in verilog.txt for details on these system tasks.
        $dumpfile(`VCDFILE); // name the vcd file before calling $dumpvars
        $dumpvars;
`endif
     end // initial begin

SDF annotation: (for gate sims only)
--------------
annotator:
--------
The SDF file is brought into the analysis tool through an annotator. The job of the annotator is to match data in the SDF file with the design description and the timing models. Each region in the design identified in the SDF file must be located and its timing model found. Data in the SDF file for this region must be applied to the appropriate parameters of the timing model. SDF annotation is performed during elaboration, and can only take place at time 0.
2 ways to do sdf annotation:
-----------------------
A. $sdf_annotate utility:
Simulator only reads compiled SDF file (sdf_filename.X). SDF src file is provided in $sdf_annotate and then it's compiled by the ncsdfc utility within elaborator to generate sdf_filename.X file, which is used by verilog-XL. Once *.X file is there, it can be used by the simulator for subsequent runs.
for SDF annotation, we need to do same thing as for fsdb/vcd dump file in top level module (digtop_tb). $sdf_annotate can only be in an initial block for verilog code, as it always takes place at time 0 only.

initial begin
      $sdf_annotate("/db/DRV9401/design1p1/HDL/FinalFiles/digtop_VDIO_Max_aligned.sdf", digtop_00,,"logs/sdf_max.log", "MAXIMUM"); // 7 args to sdf_annotate = name of sdf file, top level module inst name, cfgfile, logfile, MINIMUM/TYPICAL/MAXIMUM, scale_factor, scale_type.
      // for min sdf annotation, use this call instead:
      // $sdf_annotate("/db/DRV9401/design1p1/HDL/FinalFiles/digtop_VDIO_Min_aligned.sdf", digtop_00,,"logs/sdf_min.log", "MINIMUM");
      // if $sdf_annotate is called from some other module, then we have to specify the full hier, i.e. dut.digtop_00
end

NOTE: after running gatesim with sdf_annotate, look in sdf_max.log to make sure it has no errors or warnings. Else, sdf is not correctly annotated or not annotated at all for such paths. Usually we get warnings like "ncelab: *W,SDFNEP: Unable to annotate to non-existent path ..." => this indicates that an arc in the sdf file has no corresponding arc in the verilog model file (i.e in AN210.v). This usually happens with flops, where verilog models of flop (i.e SDC210.v) may have setup and hold arcs separate, while sdf file may have both combined as $setuphold, which may cause this warning.
Arcs in sdf file come from .lib file, while sdf annotation matches these arcs against the std cell verilog model file. So, basically every arc in .lib file should match an arc in specify section of verilog model file. Sometimes we have conditional arcs in verilog (i.e arc from S->Y for MUX2). Corresponding arcs in .lib file are written with "sdf_cond : "!A&&B";" etc. "ifnone" arcs in verilog are written with no "sdf_cond" in .lib files. These arcs are written as "CONDELSE" in sdf files. Sometimes, some of these conditional arcs are missing in .lib files, which causes sdf files to be missing these arcs too. PT/ETS run using .lib files, so they may also have incorrect timing: timing tools choose the arc with worst/best possible timing, so if the missing arc has the worst/best timing, then the reported timing doesn't reflect that arc.
NOTE: when generating sdf file, always use correct options, or some of the arcs might get removed from sdf file even though present in .lib files. One such example is using "CONDELSE" combo path arcs.

For ex: flop in SDC10.v has this in specify section:
     (CLK *> Q  ) = (0.100000:0.100000:0.100000 , 0.100000:0.100000:0.100000);
In SDF, 1st case shown below will pass while second will fail:
IOPATH CLK Q (1.0:1.0:1.0) (0.8:0.8:0.8) => pass
IOPATH (posedge CLK) Q (1.0:1.0:1.0) (0.8:0.8:0.8) => fail, since there is no negedge/posedge clause in verilog model

Other warnings:
1. *W,NTCNNC: Non-convergence of negative timing check values in instance I_xyz/reg_5 => -ve timing check couldn't converge. see in verilog.txt for more details
2. *W,SDFNDP: Annotation resulted in a negative delay value or pulse limit to specify path or interconnect delay, setting to 0 => This happens when there are -ve values for delay in sdf file. Since simulator can't go back in time, it has to use 0 or +ve values. So, it sets all these -ve delay values to 0.
3. *W,SDFNEP: Unable to annotate to non-existent path (COND readcond (IOPATH CLK Q[24])) of instance DIG_TOP...U234 of module sshdbw00056025020 <../input/DIG_TOP_routed.fromPT.Min.sdf, line 169701> => This indicates that an arc was found in sdf but not in verilog model file. This usually happens with RAM/ROM IP, which may have intentional blackbox verilog models, which don't have any arcs.
NOTE: any of the above warnings do NOT cause missing annotations, as simulator runs with verilog arcs, and uses the default delay or the sdf delay for that arc. So extra arcs in sdf file are OK. Only when arcs are present in verilog but absent from sdf, is when we see unannotated arcs.
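
Since the log has to be checked after every gatesim run, the check is easy to script. Below is a minimal Python sketch that counts messages per code in an annotation log. It assumes only the "*<letter>,<CODE>" message form seen in the examples above (e.g. *W,SDFNEP); the function name summarize_sdf_log is made up for these notes.

```python
import re
from collections import Counter

# Message tags look like "*W,SDFNEP" in the examples above:
# a severity letter, a comma, then the message code.
TAG = re.compile(r"\*([A-Z]),(\w+)")


def summarize_sdf_log(text):
    """Return a Counter keyed by (severity letter, message code)."""
    counts = Counter()
    for severity, code in TAG.findall(text):
        counts[(severity, code)] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "ncelab: *W,SDFNEP: Unable to annotate to non-existent path ...\n"
        "ncelab: *W,SDFNDP: Annotation resulted in a negative delay value ...\n"
    )
    for (severity, code), n in sorted(summarize_sdf_log(sample).items()):
        print(severity, code, n)
```

An empty result means the log had no tagged messages at all; any SDFNEP/SDFNDP/NTCNNC counts can then be reviewed against the explanations above.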

More options for sdf reporting:
1. -sdf_verbose: We can use option "-sdf_verbose" with irun cmd to print more detailed report in sdf.log file. With "-sdf_verbose" option, we'll see each cell instance, and the arcs annotated to it. It will have warnings (*W,SDFNEP) if while annotating a cell from sdf file, it's not able to find corresponding arc in verilog model file. Once all the cell arc annotation is done, we'll see "ABSOLUTE PORT:" delays, which show interconnect delay for getting to an i/p pin of each instance. This is taken from the "INTERCONNECT" delay section of sdf file. The reason, we only see i/p pins of cells and NOT the o/p pins is because interconnect delay is just needed for each i/p pin to form the full path. That is also the reason, why interconnect delays are not specified b/w 2 points (o/p of one gate to i/p of other gate), as it's not needed.
2. -sdfstats: If we want to have more sdf stats for unannotated arcs, we can run irun with options "-sdf_verbose -clean -sdfstats sdf_unannotated.txt". Then it shows a list of unannotated arcs with their corresponding cells. Arcs that are in verilog model, but not in sdf are the arcs that are left unannotated (and shows up as less than 100% annotation). In that case, simulator takes the default delay of such arcs from the verilog model file.

B. Cmd file:
Instead of using annotator cmd ($sdf_annotate), we can do sdf annotation using these 3 steps:
1. generate compiled sdf file using this cmd on the unix shell:
ncsdfc SPI.sdf -output SPI.blah => generates SPI.sdf.X in the current dir if no output file specified with -output.
2. write sdf cmd file: There are seven statements, which correspond to the seven arguments of the $sdf_annotate system task. Only one statement is required: the COMPILED_SDF_FILE statement, which specifies the compiled SDF file that you want to use. Others are optional (create cmd file named: myfile.sdf_cmd). Note: statements are separated by commas, and the file has to be terminated with a ;
COMPILED_SDF_FILE = digtop_func_W_125_1.62.sdf.X,
SCOPE = :pm7324_inst, => annotate to the VHDL scope :pm7324_inst, which may contain Verilog blocks. For us, it's :UUT or tb_digtop.dut.
LOG_FILE = "pm7324_flat.sdf.log", =>log
MTM_CONTROL = "TYPICAL", => min/typ/max. Indicates which triplet will be used.
SCALE_FACTORS = "1.0:1.0:1.0", => optional. mult factor for min/typ/max
SCALE_TYPE = "FROM_MTM"; => optional. scales timing specs FROM_MINIMUM/FROM_TYPICAL/FROM_MAXIMUM/FROM_MTM. i.e it indicates which of the 3 triplets will be used. For ex: if MTM_CONTROL = "TYPICAL", then we specify SCALE_TYPE = "FROM_TYPICAL".
3. #for ncelab, use ncelab -sdf_cmd_file filename option to include the SDF command file.
ncelab -sdf_cmd_file myfile.sdf_cmd worklib.top
#For irun, we can use the same option: irun .... -sdf_cmd_file myfile.sdf_cmd -sdf_verbose ...
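
When there are several corners, the cmd file from step 2 can be generated instead of hand edited. Below is a minimal Python sketch that emits the comma separated statements with the terminating ";". The function name sdf_cmd_file is hypothetical, quoting follows the example above (file name and scope unquoted, other values quoted), and only four of the seven statements are covered.

```python
def sdf_cmd_file(compiled_sdf, scope=None, log_file=None, mtm_control=None):
    """Return the text of an SDF command file.

    Only COMPILED_SDF_FILE is required; the other statements are
    added when a value is given. Statements are joined with commas
    and the last one is terminated with a semicolon.
    """
    stmts = [f"COMPILED_SDF_FILE = {compiled_sdf}"]
    if scope:
        stmts.append(f"SCOPE = {scope}")
    if log_file:
        stmts.append(f'LOG_FILE = "{log_file}"')
    if mtm_control:
        stmts.append(f'MTM_CONTROL = "{mtm_control}"')
    return ",\n".join(stmts) + ";\n"


if __name__ == "__main__":
    print(sdf_cmd_file("SPI.sdf.X", scope=":UUT",
                       log_file="sdf.log", mtm_control="MAXIMUM"), end="")
```

The returned text would be written to myfile.sdf_cmd and passed with -sdf_cmd_file as in step 3.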

When running irun, we see annotation message like this:
     Reading SDF file from location "/vobs/.../digtop_func_QC_NOM_1.8_ATD-N_25_1.8-.sdf"
     Writing compiled SDF file to "/sim/.../../digtop_func_QC_NOM_1.8_ATD-N_25_1.8-.sdf.X".
    Annotating SDF timing data:  ....    
    Annotation completed successfully...
    SDF statistics: No. of Pathdelays = 29695  Annotated = 100.00% -- No. of Tchecks = 38702  Annotated = 99.99% => Path_delays/Tchecks refer to ones in verilog model for cells, while Annotated refer to ones in sdf
                        Total        Annotated      Percentage
         Path Delays           29695           29695          100.00 => path delays refer to IOPATH in cell, and not to interconnect delay. Here verilog model IOPATH(under Total) for all cells match sdf IOPATH(under Annotated). Reason for mismatch would be when there's an extra gate in netlist but not in sdf file
             $period               2               2          100.00
              $width            6942            6942          100.00
             $recrem            4506            4506          100.00
          $setuphold           27252           27250           99.99 => 2 setuphold arc in verilog for which the annotator didn't find corresponding arc or timing in sdf. This needs to be fixed as they should match exactly at 100%.
NOTE: missing interconnect delays will be reported separately as "ncelab: *W,SDFINC: interconnect ... not connected to ..."
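
The statistics table above can be checked automatically so that anything below 100% gets flagged. Below is a minimal Python sketch; it assumes the row layout shown above (name, Total, Annotated, Percentage), and the function name unannotated is made up for these notes.

```python
import re

# One table row: "Path Delays" or a $timing_check name, then
# Total, Annotated and Percentage columns.
ROW = re.compile(r"^\s*(Path Delays|\$\w+)\s+(\d+)\s+(\d+)\s+([\d.]+)",
                 re.MULTILINE)


def unannotated(report):
    """Return {row name: number of arcs left unannotated} for rows below 100%."""
    missing = {}
    for name, total, annotated, _pct in ROW.findall(report):
        if int(annotated) < int(total):
            missing[name] = int(total) - int(annotated)
    return missing


if __name__ == "__main__":
    report = """
                        Total        Annotated      Percentage
         Path Delays           29695           29695          100.00
             $period               2               2          100.00
          $setuphold           27252           27250           99.99
    """
    print(unannotated(report))
```

Rows that show up here are the ones to chase with -sdfstats, as described earlier.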

NOTE: If we provide non-existent sdf file in $sdf_annotate, then irun doesn't give any warnings. We don't see any annotation messages as shown above. Instead delays from verilog models (ex 0.01ns for gates when TI_verilog is defined) are taken, and annotation is done using those delays. As a result, we may see tons of timing violations for cells. Best way to find out is to pull up waveform and check delay for buffers/inverters and make sure they match those from sdf files.

------------
SDF file format is below in another section.

----------------
#Then we run Ncverilog or irun with loadpli1 (pointing to verdi PLI), and we get waveform dump. Then we start running debussy in separate dir to debug this waveform.

script: create_symbols for debussy/verdi:
-----------------------
creating symbols: Debussy/Verdi can display gate-level schematics using the proper symbols for the cells used in the netlist.  To enable this, you must set up a Debussy/verdi symbol library for the target cell library.  The symbol library can be created by running the utility syn2SymDB on the equivalent Synopsys Liberty (.lib) library.
syn2SymDB -o foo_u foo.lib foo1.lib =>
     -o:  Specifies output library name
      foo.lib:  Synopsys library name. Other lib can be added separating them by space
 This creates symbol library (directory) called foo_u.lib++.

NOTE: we can also run "vericom" compiler by synopsys to generate foo_u.lib++
cmd: vericom -2013.09 -sv -f list_rtl.f -lib VerdiLib => reads rtl files to generate VerdiLib.lib++

ex: just typing syn2SymDB may not work, so type the whole path
/apps/novas/debussy/2010.04/platform/LINUXAMD64/bin/syn2SymDB -o symbol \
/db/pdk/lbc8lv/current/diglib/msl445/current/synopsys/src/MSL445_N_25_1.8_CORE.lib \
/db/pdk/lbc8lv/current/diglib/msl445/current/synopsys/src/MSL445_N_25_1.8_CTS.lib \
/db/pdk/lbc8lv/current/diglib/msl445/current/synopsys/src/MSL445_N_25_1.8_ECO.lib
=> creates symbol.lib++ dir.

You must reference this symbol library by setting the following two environment variables:
     setenv TURBO_LIBS "foo_u"
     setenv TURBO_LIBPATHS <path to the directory containing the symbol library directory>

We can also include these 2 variables in novas.rc file as:
   TurboLibs = symbol
   TurboLibPaths = /data/VIKING_OA_DS3/a0783809/debussy/lib
=> novas.rc gets loaded anytime debussy is invoked, so it looks in "lib" dir for "symbol.lib++" and adds all those symbols.

#Invoke Debussy and compile/load your netlist.
debussy -2012.04 /data/.../DIG_TOP_routed.v => This loads PnR netlist so that we can see schematic of this. (2012 version shows old gui, while later ones show new gui)
verdi /data/.../DIG_TOP_routed.v -upf2.0 Top.upf -upftop digtop => Loads PnR netlist into verdi (-upf loads upf to show various power domains in design. If loading upf, top module name for upf needs to be provided)


debussy quick tips:
------------
0. clicking on the AND gate symbol (2nd row 3rd col on gui) brings up the schematic.
1. When tracing loads, click on any net and click "Trace Load". Then from top, do tools->New Schematic->From Trace Results. This brings a new window which only shows net and all loads. This is helpful to see all loads on any net.
2. click Schematic->Find (or Caps A), and put name of nets/instance and it will show all. Select one that you need and click "c" to change color of that net.

script: run_debussy_rtl/run_debussy_gate: for gate runs and rtl runs
----------------------
run_debussy_rtl:

#in our dir, we see PML30.lib++. So, we set these var as follows and invoke debussy. NOTE: we don't really need these symbols since rtl only has clk gaters instantiated from library, so those will show as square box.
setenv TURBO_LIBS PML30
setenv TURBO_LIBPATHS /db/Hawkeye/design1p0/HDL/Debussy/
debussy -f list_rtl.f -vtop vtop.map -2001 -autoalias & => we can also just use "debussy &"

list_rtl.f
-------
-f /db/DRV9401/design1p1/HDL/Source/digtop_rtl.f => has paths to all rtl files from source area: /db/DRV9401/design1p1/HDL/Source/digtop.v, global.v, etc
/db/DRV9401/design1p1/HDL/Testbenches/kagrawal/digtop/tb/digtop_tb.v => has path to top level tb block

run_verdi_rtl:

----------------

#invoke verdi to load RTL

vericom -2013.09 -sv -f list_rtl.f -lib VerdiLib => invoke vericom to create VerdiLib.lib++ from rtl files. (For some reason, this gives lots of errors when reading verilog packages. Options "-2012 -ssv -ssy" seem to resolve all these errors: -2012 enables system verilog constructs (probably same as -sv), while "-ssv -ssy" enables verdi database for library cells.)

verdi -lib VerdiLib -top digtop => Here we are loading VerdiLib.lib++, no need to specify RTL files, as lib++ already has lib built from rtl from earlier step (when running vericom)

verdi -f list_rtl.f => This loads list_rtl.f directly instead of generating lib thru vericom. For some reason, this gives lots of errors with packages.

vtop.map: debussy accesses already dumped fsdb files. The map file maps hier in fsdb to that in RTL.
-------
digtop = digtop_tb.digtop_00 => this provides the hier path to the dut (digtop_00 is the instance name of digtop [digtop is top level RTL module] instantiated in digtop_tb)

run_debussy_gate: same as with rtl except that we run it directly on gate netlist:
---------------
list_gate.f
/db/DRV9401/design1p0/HDL/FinalFiles/digtop_VDIO.v => path to gate level netlist
../Testbenches/digtop/tb/digtop_tb.v => path to top level tb

vtop.map
-------
digtop_top = digtop_tb.digtop_00 (If top level module in gate netlist is called digtop_top, then that is what we specify. This digtop_top ties gate level netlist with tb file)

NOTE: If we get *.vcd file from analog team, then to run debussy, we need to map the hier from .vcd file to our gate netlist. so, in vtop.map:
digtop_top =  zorro_toplevel_sch.I3.I7.I0 (here I0 at the end refers to the inst of "digtop_top" module in gate level netlist digtop_VDIO.v. zorro_toplevel_sch is the schematic name within which we have top level block I3, which contains digital wrapper I7, within which we have digital block I0)

running debussy when debugging RTL:
----------------------------------
Bring up Debussy nTrace. goto source-> mark Parameter annotation and active annotation.
Now, open nWave by going to Tools->New Waveform.
Now, we can drag and drop signals from nTrace to nWave and vice versa, and observe signals.
1. We can click on clk edge in nWave and that will show which values changed.
2. We can click on signal names in nTrace and it will backtrace it.
3. We can click "c" on any net, and we can set net to chosen color.
4. We can open 2 nWave windows from nTrace by going to Tools->New Waveform twice. Then goto nWave "window" button and turn ON sync waveform view. We do it for both the windows so that clicking on cursor in any one of them will affect the other (if we do it for only one of them, then clicking on cursor in that window will affect the other, but not the other way around). Then the 2 nWave windows will be synced in time, so that it's easier to compare results (for ex b/w RTL and gate).
5. NOTE: when we open nWave using Debussy and do active annotation, we will see the name of the fsdb file on the top panel of debussy window. That is the fsdb file that is actively annotated with the current RTL that we see in the RHS of debussy main window. If we open any other nWave window and any other fsdb file, it will NOT be actively annotated with that RTL. To actively annotate the other fsdb file, we goto nWave window of new fsdb, click on Window->change to primary. This changes this new nWave window to be actively annotated with the current RTL (we will see the name of this new fsdb file on the top panel of debussy window). So, we can switch back and forth b/w multiple nWave windows.
6. NOTE: sometimes when we load new fsdb from nWave window, it may not get annotated properly with rtl. So, best way to open a new fsdb is to do it from Debussy panel. In debussy, goto File->close simulation results. This kills the current fsdb, but retains all the signals, so that we don't have to save it. Now, do File->load simulation results and open the new fsdb. This is the correct way to view new fsdb.

running debussy when looking at gate netlist for ECO:
----------------------------------------------------
run_debussy_eco: here we are just looking at schematic of gate netlist, so we invoke debussy with just gate level netlist.

#in our dir, we see PML30.lib++. So, we set these var as follows and invoke debussy.
setenv TURBO_LIBS PML30
setenv TURBO_LIBPATHS /db/Hawkeye/design1p0/HDL/Debussy/
debussy -f list_gate.f & => list_gate.f has path to gate level netlist /db/DRV9401/design1p0/HDL/FinalFiles/digtop_VDIO.v
#debussy & => If we call debussy w/o -f option, then we have to do File->Import design, Put the file name (/db/DRV9401/design1p0/HDL/FinalFiles/digtop_VDIO.v) in bottom box, and then click Add, then OK.

Then click on Tools->New Schematic->Current scope

patgen files:
-----------
Ex: /db/MOTGEMINI_DS/design1p0/HDL/Testbenches/digtop/patgen

verilog models used: (for lbc7)
--------------------
A. MODEL_functiononly: timescale is 1ps/1ps. It has following delays specified:
 1. gates (AN2,etc) = 0
 2. clk gating cells (CG*) = 0
 3. flops, c2q delay = 1ps. For ex: in DTP20.v (in lbc7), "buf" and "not" gates are specified delay of #1(1ps), so final o/p Q/QZ have delay of 1ps.

B. MODEL_verilog: timescale is 1ns/1ps. It has following delays specified:
 1. gates = 10ps(#0.01) (in specify section =>)
 2. clk gating cells (CG*) = 100ps(#0.1) (in specify section =>). Also does setup/hold checks.
 3. flops, c2q delay = 100ps(#0.1) (in specify section =>). Also does setup/hold checks.

C. If nothing defined (neither MODEL_functiononly nor MODEL_verilog). timescale is 1ns/1ps. same as MODEL_verilog except no checks done:
 1. gates = 10ps(#0.01) (in specify section =>)
 2. clk gating cells (CG*) = 100ps(#0.1) (in specify section =>). NO setup/hold checks done.
 3. flops, c2q delay = 100ps(#0.1) (in specify section =>). NO setup/hold checks done.

NOTE: we used "specify" instead of putting delays as "#" so that when we do sdf annotation, it will disregard delays in specify section. If we hardcoded delays as #, then we would have double counted the delay, as sdf annotation would have happened on top of existing delay in verilog model.
NOTE: Only delay numbers are disregarded in specify section, all arcs (c2q, setup/hold, rec/rem, width, etc) are still honored and transfer is passed to appr notifier in the verilog model (using delay numbers from sdf file).

NOTE: When we run AMS sims, we run toplevel sims directly on digital schematic which is a gate level netlist. We don't have sdf file to annotate delays for gates. So, we set "MODEL_functiononly" as that will cause no setup/hold issues. Flops will always have 1ps delay, so they will always have enough setup time, and since all comb gates on clk have "0" delay, there will be no hold issues.  If we run it with MODEL_verilog (or with nothing defined), then hold issues may show up, which may not be actually present in ckt. This hold will show up if we've (c2q+data_path_delay < clk_path_delay). Usually clk path has < 10 clk buffers (10ps each, so < 100ps total, which is less than the 100ps c2q), so no hold issues. However, if even 1 clk gating cell (100ps) gets added on clk path, then hold will get violated as data will change before clk at the capturing flop (if no of clk buffers is greater than no of gates in data path) or at same time as clk (if no of clk buffers is same as no of gates in data path).
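
The hold condition above (c2q+data_path_delay < clk_path_delay) is simple arithmetic. A minimal Python sketch, assuming the MODEL_verilog unit delays listed above (10ps per gate or clk buffer, 100ps per clk gater, 100ps c2q), a zero hold requirement at the capturing flop, and no delay on the launching clk. Function and constant names are made up for illustration:

```python
# Unit delays in ps, taken from the MODEL_verilog numbers in these notes.
GATE_PS = 10        # comb gate or clk buffer (#0.01 at 1ns timescale)
CLK_GATER_PS = 100  # clk gating cell (#0.1)
C2Q_PS = 100        # flop clk-to-q (#0.1)


def hold_slack_ps(data_gates, clk_buffers, clk_gaters=0):
    """Return (c2q + data path delay) - (capture clk path delay) in ps."""
    data_arrival = C2Q_PS + data_gates * GATE_PS
    clk_arrival = clk_buffers * GATE_PS + clk_gaters * CLK_GATER_PS
    return data_arrival - clk_arrival


def hold_ok(data_gates, clk_buffers, clk_gaters=0):
    """Hold is met only if data changes strictly after the capture clk edge."""
    return hold_slack_ps(data_gates, clk_buffers, clk_gaters) > 0


if __name__ == "__main__":
    print(hold_slack_ps(0, 9))      # 9 clk buffers, no clk gater
    print(hold_slack_ps(2, 2, 1))   # 1 clk gater, equal buffers and gates
    print(hold_slack_ps(1, 3, 1))   # 1 clk gater, more buffers than gates
```

With no clk gater and fewer than 10 buffers the slack stays positive. With one clk gater the slack is zero or negative whenever the clk path has at least as many buffers as the data path has gates.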

modeling delays in simulations:
------------------------------
By default, verilog gate level models and interconnect delays are always simulated as transport delays, but they look as if they are simulated as pure inertial delays (since they don't allow glitches shorter than prop delay to pass thru). This is because, by default, pulse_r and pulse_e are set to 100%. These are verilog cmd line switches that can be used to alter this behaviour for gate level sims. Delays inside of specify blocks are affected when these cmd line switches are used with simulators: (add +transport_path_delays also)
A. +pulse_r/R% : switch forces pulses that are shorter than R% (R=0 to 100) of the propagation delay of the device being tested to be "rejected" or ignored.
B. +pulse_e/E% : switch forces pulses that are shorter than E% (E=0 to 100) but longer than R% of the propagation delay of the device being tested to be an "error" causing unknowns (X's) to be driven onto the output of the device. Any pulse greater than E% of the propagation delay of the device being tested will propagate to the output of the device as a delayed version of the expected output value.
scenarios are as below:
  0% ------  R%  -------   E% ------  100%
 --- reject  --  error(x)  -- output  --- => So, glitches can be rejected, output an x or get out as normal delayed version depending on settings.

Ex: vcs -RI +v2k tb.v delaybuf.v +pulse_r/0 +pulse_e/0 +transport_path_delays => causes pulses shorter than 0% to be rejected, and pulses greater than 0% to be propagated to the o/p. => all pulses are passed, no matter how small.
Ex: +pulse_r/0 +pulse_e/100 => causes no glitches to be rejected, but o/p x, for glitches shorter than propagation delay.
Ex: +pulse_r/100 +pulse_e/100 => models inertial delays, where all pulses shorter than propagation delay are ignored.
Ex: +pulse_r/20 +pulse_e/20 => causes  glitches <20% to be rejected, but glitches >20% to be passed.
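
The reject/error/output regions above can be sketched as a small classifier. This is only an illustration in Python, not simulator code: the function name is made up, the behaviour for a pulse exactly at a limit is an assumption, and clamping E up to R when E < R is also an assumption.

```python
def classify_pulse(width, prop_delay, reject_pct=100, error_pct=100):
    """Classify a pulse against +pulse_r/R and +pulse_e/E style limits.

    width and prop_delay are in the same time unit.
    Returns "reject", "error" (X on output) or "pass".
    """
    # Assumption: error limit is never allowed to be below reject limit.
    error_pct = max(error_pct, reject_pct)
    reject_limit = prop_delay * reject_pct / 100.0
    error_limit = prop_delay * error_pct / 100.0
    if width < reject_limit:
        return "reject"
    if width < error_limit:
        return "error"
    return "pass"


if __name__ == "__main__":
    delay = 100  # ps, propagation delay of the cell
    for r, e in [(0, 0), (0, 100), (100, 100), (20, 20)]:
        results = [classify_pulse(w, delay, r, e) for w in (10, 30, 150)]
        print(f"+pulse_r/{r} +pulse_e/{e}: {results}")
```

The four (R, E) pairs in the loop are the same four settings as the examples above, applied to 10ps, 30ps and 150ps pulses on a cell with 100ps delay.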

NOTE: when we run gate sim, we may start seeing "glitch suppression" warnings (many times after adding pulse_r/pulse_e switches).
EX: Warning!  Glitch suppression
           Scheduled event for delayed signal of net "GVC_D_D" at time 1027453294 PS was canceled!
            File: /db/pdkoa/lbc8lv/current/diglib/msl458/PAL/CORE/verilog/SDP10B_LL.v, line = 92
           Scope: tb_digtop.dut.I_i2c_top.I_bellatrix_i2c_slave.I_meson_i2c_fsm.bitCnt_reg_2
            Time: 1027453096 PS

Glitch suppression: This happens when there are -ve timing values, which causes simulator to use delayed signals. When a delay with two values is calculated, there is the possibility that an event on the input net may cancel a scheduled event on the internal signal driven by the delay. This is called glitch suppression. Because glitch suppression can hide input events from a timing check's input, the simulator generates a glitch suppression timing violation if an event on a delayed signal is canceled.
To suppress the warnings due to the glitch suppression algorithm, use the -nontcglitch simulation option.

NOTE: the above cmd line switches only valid for delays in specify block, not for delays using SDF annotation. For sdf delays, we need to have these in absolute numbers within sdf file for each cell.
NOTE: to specify reject/error, we need to have extra parentheses, like this:
ex: (IOPATH A Y ((rise_delay) (rise_reject) (rise_error)) ((fall_delay) (fall_reject) (fall_error)) ) => extra parentheses. Empty parentheses for reject/error imply that reject/error is set equal to delay value => inertial delay model
ex: (IOPATH A Y (rise_delay) (fall_delay)) => no extra brackets, so values are delay values. no reject/error values.
ex:
(CELL
  (CELLTYPE "IV110")
  (INSTANCE U32)
  (DELAY
    (ABSOLUTE
    (IOPATH A Y ((0.066:0.066:0.066) (0.015:0.015:0.015) (0.019:0.019:0.019)) ((0.059:0.059:0.059) (0.012:0.012:0.012) (0.017:0.017:0.017))) => 66ps for o/p rise delay, 15ps is rise reject limit while 19ps is rise error limit. 59ps for o/p fall delay, 12ps is fall reject limit while 17ps is fall error limit.
    )
  )
)
Ex: we can also use "PATHPULSEPERCENT" keyword in sdf file to specify reject and error limits in % terms.
    (IOPATH A Y (0.066:0.066:0.066) (0.059:0.059:0.059))
    (PATHPULSEPERCENT A Y (25) (35)) => 25=pulse reject limit in %, 35=pulse error limit in %
-----------------------

SDF file syntax: ( /db/Hawkeye/design1p0/HDL/Primetime/digtop/sdf/digtop_max.pt.sdf)
-----------------
OVI (open verilog intl) developed SDF v3 syntax. timing calc tools (PT,etc) are resp for generating SDF.

syntax:
------
(DELAYFILE
(SDFVERSION "OVI 3.0")
(DESIGN "digtop")
(DATE "Thu Jul 21 20:22:34 2011")
(VENDOR "PML30_W_150_1.65_CORE.db PML30_W_150_1.65_CTS.db")
(PROGRAM "Synopsys PrimeTime")
(VERSION "D-2010.06")
(DIVIDER /) => hier divider is / (by default, it's .) a/b/c
// OPERATING CONDITION "W_150_1.65" => // is for comment
//triplets are always in form - min:typ:max for delay
(VOLTAGE 1.65:1.65:1.65) => best:nom:worst
(PROCESS "3.000:3.000:3.000") => best:nom:worst
(TEMPERATURE 150.00:150.00:150.00) => best:nom:worst
(TIMESCALE 1ns) => implies all delay values are to be multiplied by 1ns
//delays specified in CELLS for both interconnect and cell delay.
//interconnect delays => we may have the block below repeated many times as only some wires may be in each block. It's easier for readability. interconnect delays are always between 2 points => o/p of one gate to i/p of other gate.
(CELL => interconnect delays specified here. Interconnect delays are of order of ps (very small), while cell delays are of order of ns. All INTERCONNECT delays are only specified for top level module (digtop). For wires which are not in digtop, hier names are used.
  (CELLTYPE "digtop")
  (INSTANCE) //no instance specified, implying it's interconnect delay
  (DELAY =>
    (ABSOLUTE => delay can be ABSOLUTE or INCREMENT
    (INTERCONNECT scan_out_iso/U282/Y em_out_31_I_bufx4/A (0.008:0.008:0.008) (0.008:0.008:0.008)) //rise/fall (min:typ:max) delays. min:typ:max are same delays for one sdf file as we use separate sdf files for min/typ/max corners. However, if we use newer tools such as Tempus to generate sdf, we may see (0.41::0.62), which indicates that for sdf generated at a particular corner (say NOM.sdf), we may have different values for min,typ,max. In timing tools for OCV runs, for a given corner (say NOM), min value in triplet is used for clk, max for data (for setup check) and vice versa for hold. However for gatesim, it takes only one value for all paths, and we specify what triplet value to use (by stating "MAXIMUM","TYPICAL" or "MINIMUM" in sdf_annotate). So, ideally, we should run gate sims with "MAX" triplet with QC_MAX.sdf, and "MIN" triplet with QC_MIN.sdf. "MAX" and "MIN" triplet with QC_NOM.sdf is not really needed as it will be bounded by MAX/QC_MAX.sdf and MIN/QC_MIN.sdf.
    (INTERCONNECT scan_out_iso/U164/Y a2d_trg_out_I_bufx8/A (0.001:0.001:0.001)) //same delay for rise/fall (NOTE: hier names used)
    ...
    )
  )
)   
//cell delays
(CELL => delay for cells: delays for each instance defined separately, since it may be diff based on load.
  (CELLTYPE "NA210") =>nand gate
  (INSTANCE test_mode_dmux/U85) => in test_mode_dmux module. Since instance is specified, it's cell delay
  (DELAY
    (ABSOLUTE
    (IOPATH A Y (0.129:0.129:0.129) (0.170:0.170:0.170)) //rise/fall for Y (min:typ:max)delays. We don't specify rise/fall for A as it's automatically decided based on direction of Y.
    (IOPATH B Y (0.157:0.157:0.157) (0.158:0.158:0.158))
    (COND !A&&!B (IOPATH Y S  (0.630::0.641) (1.470::1.476))) //some complex cells(adders, etc) will have cond delay arcs.
    )
  )
)
(CELL => flop delay. flops will have delay arcs as well as timing check arcs.
  (CELLTYPE "TDC10")
  (INSTANCE spi/data_reg_15)
  (DELAY
    (ABSOLUTE
    (IOPATH CLK QZ (0.622:0.622:0.622) (0.624:0.624:0.624)) => NOTE: sdf doesn't say rise/fall of CLK in IOPATH. Only rise/fall of QZ. However, model file specifies QZ delay wrt posedge or negedge CLK. So, there's always this discrepancy b/w verilog model file and sdf file for all IOPATH.
    (IOPATH CLK Q (1.308:1.308:1.308) (0.874:0.874:0.874))
    (IOPATH CLRZ QZ (0.936:0.936:0.936) ())
    (IOPATH CLRZ Q () (1.217:1.217:1.217))
    )
  )
  (TIMINGCHECK => checks
    (WIDTH (posedge CLK) (0.176:0.176:0.176)) => min allowable time for +ve(high) pulse of clk
    (WIDTH (negedge CLK) (0.692:0.692:0.692)) =>  min allowable time for -ve(low) pulse of clk
    (WIDTH (negedge CLRZ) (0.330:0.330:0.330)) => min allowable time for -ve(low) pulse of clrz
    (SETUPHOLD (posedge D) (posedge CLK) (0.437:0.437:0.437) (-0.263:-0.263:-0.263)) => setup and hold checks for rising edge of D wrt +ve clk. First triplet(0.437) is for setup, while second(-0.263) is for hold. Triplets are min:typ:max delays. SETUP and HOLD can also be separated by using SETUP and HOLD keywords. NOTE: setup is +ve, while hold is -ve (typically true for flops as data lines inside flops have extra gates before they hit clk logic)
    (SETUPHOLD (negedge D) (posedge CLK) (0.716:0.716:0.716) (-0.288:-0.288:-0.288)) => similarly for falling edge of D
    (SETUPHOLD (posedge SCAN) (posedge CLK) (0.954:0.954:0.954) (-0.592:-0.592:-0.592))
    (SETUPHOLD (negedge SCAN) (posedge CLK) (0.659:0.659:0.659) (-0.538:-0.538:-0.538))
    (SETUPHOLD (posedge SD) (posedge CLK) (0.472:0.472:0.472) (-0.317:-0.317:-0.317))
    (SETUPHOLD (negedge SD) (posedge CLK) (0.756:0.756:0.756) (-0.332:-0.332:-0.332))
    (RECREM (posedge CLRZ) (posedge CLK) (0.405:0.405:0.405) (0.084:0.084:0.084)) => recovery check is like setup check for clrz, where it should go inactive sometime before the clk, so that flop i/p can get flopped. Removal check is like hold check for clrz, where it should go inactive sometime after the clk, so that flop i/p doesn't get flopped that cycle, but the next cycle. RECREM combines RECOVERY and REMOVAL checks in one. 1st triplet(0.405) is recovery, 2nd(0.084) is removal.
  )
)
(CELL => clkgater delay
  (CELLTYPE "CGPT40")
  (INSTANCE hwk_regs/clk_gate_ccd_brightness_out_reg/latch)
  (DELAY
    (ABSOLUTE
    (IOPATH CLK GCLK (0.642:0.642:0.642) (0.610:0.610:0.610))
    )
  )
  (TIMINGCHECK
    (WIDTH (negedge CLK) (0.538:0.538:0.538))
    (SETUPHOLD (posedge TE) (posedge CLK) (0.701:0.701:0.701) (-0.448:-0.448:-0.448))
    (SETUPHOLD (negedge TE) (posedge CLK) (0.795:0.795:0.795) (-0.508:-0.508:-0.508))
    (SETUPHOLD (posedge EN) (posedge CLK) (0.468:0.468:0.468) (-0.214:-0.214:-0.214))
    (SETUPHOLD (negedge EN) (posedge CLK) (0.721:0.721:0.721) (-0.430:-0.430:-0.430))
  )
)
(CELL => latch delay
  (CELLTYPE "LAH11")
  (INSTANCE flipper_top/flipper_ram/flipper_ram_reg_185_5)
  (DELAY
    (ABSOLUTE
    (IOPATH C Q (0.826:0.826:0.826) (1.097:1.097:1.097))
    (IOPATH D Q (0.622:0.622:0.622) (1.250:1.250:1.250))
    )
  )
  (TIMINGCHECK
    (WIDTH (posedge C) (0.733:0.733:0.733))
    (SETUPHOLD (posedge D) (negedge C) (0.464:0.464:0.464) (-0.339:-0.339:-0.339))
    (SETUPHOLD (negedge D) (negedge C) (1.116:1.116:1.116) (-1.079:-1.079:-1.079))
  )
)

(CELL //hard IP
    (CELLTYPE  "ophdll00032008040") => otp
    (INSTANCE  I_i2c_top/I_bellatrix_i2c_otp/I_otp_32x8)
      (DELAY
    (ABSOLUTE
    (IOPATH CLK Q[0]  (27.7495::27.7495) (7.7705::7.7705)) ...
    (IOPATH CLK Q[7]  (27.7482::27.7482) (7.7696::7.7696))
    (COND WRITECOND (IOPATH CLK BUSY  () (18.6512::18.9356)))
    (COND READCOND (IOPATH CLK BUSY  (5.5594::5.5594) (73.8027::73.8027)))
    (IOPATH PROG BUSY  (17.0202::17.2431) (18.7960::19.0702))
    )
      )
      (TIMINGCHECK
    (SETUPHOLD (posedge READ) (posedge CLK) (23.4053::23.4053) (4.3649::4.3649)) ...
    (WIDTH (COND WRITECOND (posedge CLK)) (50000.0000::50000.0000)) ... => This WRITECOND should be there in verilog model of otp else tool will complain about missing "WRITECOND". This "WRITECOND" initially came from .lib file.
    (PERIOD (COND WRITECOND (posedge CLK)) (50202.0000::50202.0000)) ...
    (SETUPHOLD (posedge D[0]) (posedge CLK) (0.1406::0.1406) (5.4447::5.4447)) ...
    (SETUPHOLD (negedge A[4]) (posedge CLK) (0.7298::0.7298) (4.0518::4.0518))
    (SETUPHOLD (negedge PROG) (posedge READ) (163.5221::163.5221) ())
    (SETUPHOLD (negedge CLK) (posedge READ) (163.5160::163.5160) ())
      )
)

-------

SDF supports both a pin-to-pin and a distributed delay modeling style. We use pin to pin.
SDF supports setup, hold, recovery, removal, maximum skew, minimum pulse width, minimum period and no-change timing checks.
interconnect delay: SDF supports two styles of interconnect delay modeling.
A. The SDF INTERCONNECT construct allows interconnect delays to be specified on a point-to-point basis from o/p port of one device to i/p port of other device. This is the most general method of specifying interconnect delay.
B. The SDF PORT construct allows interconnect delays to be specified as equivalent delays occurring at cell input ports. This results in no loss of generality for wires/nets that have only one driver. However, for nets with more than one driver, it will not be possible to represent the exact delay.

cell delay: SDF supports 2 types of cell delay.
A. IOPATH implies delay from i/p port of device to o/p port of same device. We use this for all simple cells.
B. COND implies conditional i/p to o/p path delay. We use this for complex cells (adders, etc).
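
The min:typ:max triplets and the MINIMUM/TYPICAL/MAXIMUM selection described above can be illustrated with a small parser. A minimal Python sketch, not a full SDF reader: it only handles a single parenthesized value such as "(0.41::0.62)", and the function names and the timescale_ns parameter are made up for illustration.

```python
def parse_triplet(text):
    """Parse "(min:typ:max)" into a 3-tuple of floats (None for empty fields).

    "()" gives (None, None, None). A single number such as "(0.5)" is
    taken as the same value for min, typ and max.
    """
    body = text.strip()
    if body.startswith("(") and body.endswith(")"):
        body = body[1:-1].strip()
    if body == "":
        return (None, None, None)
    parts = body.split(":")
    if len(parts) == 1:
        value = float(parts[0])
        return (value, value, value)
    if len(parts) != 3:
        raise ValueError(f"not a min:typ:max triplet: {text!r}")
    return tuple(float(p) if p.strip() else None for p in parts)


def select_delay(text, which="TYPICAL", timescale_ns=1.0):
    """Pick one triplet field and scale it by the SDF TIMESCALE (in ns)."""
    index = {"MINIMUM": 0, "TYPICAL": 1, "MAXIMUM": 2}[which]
    value = parse_triplet(text)[index]
    return None if value is None else value * timescale_ns


if __name__ == "__main__":
    print(parse_triplet("(0.129:0.129:0.129)"))
    print(parse_triplet("(0.630::0.641)"))
    print(select_delay("(0.630::0.641)", "MAXIMUM"))
```

Note that a triplet like (0.630::0.641) has no typ field, so selecting TYPICAL returns None here. How a real simulator treats the missing field is not covered by this sketch.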

************************************************

Makefile:


make utility in unix is an interpreter for Makefile. Makefile is like a shell script (similar to test.csh, run.bash, etc). The only difference is that Makefile is not an executable: it doesn't need execute permission, since "make" reads it as an input file and then runs the unix cmds in it via the shell. We write the script in a file called Makefile (note capital M in Makefile). Then we run the interpreter called "make", which interprets this Makefile (Makefile (or makefile) is the default file make looks for, we can also specify other files for make to look at) and produces desired outcome. make uses rules in Makefile to run (by default, make looks for Makefile in the dir where make is run). It is a very important utility in Linux, as many programs/applications use Makefile to generate executable files. If you want to write and compile your own large program, Makefile is essential there too. Makefile is basically a file which says what actions to take, depending on what dependencies it has. make was written because a programmer forgot to recompile a file that had changed, which cost him many hours of wasted time. make keeps track of what changed (by comparing file timestamps) and recompiles the needed files automatically, w/o the user being bothered with it. We can entirely do away with Makefile if we can manage everything manually.
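
The core decision make takes for a file target (rebuild if the target is missing or older than something it depends on) can be sketched in a few lines. This is an illustrative Python sketch of the idea only, not how make is implemented; needs_rebuild is a made-up name, and it ignores phony targets and chained rules.

```python
import os


def needs_rebuild(target, prerequisites):
    """Return True if target is missing or older than any prerequisite."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    for prereq in prerequisites:
        # A missing prerequisite also forces work (make would try to build it).
        if not os.path.exists(prereq):
            return True
        if os.path.getmtime(prereq) > target_mtime:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical file names, same shape as a "main.o: main.c sum.h" rule.
    print(needs_rebuild("main.o", ["main.c", "sum.h"]))
```

If the function returns True for main.o, make would run the action line of the main.o rule. Otherwise it leaves the file alone.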

Makefile is used extensively in generating executables for programs as C. It's also used as wrapper for calling multiple bash/csh scripts via just 1 cmd.

Very good and short intro to Makefile is here:

http://www.jfranken.de/homepages/johannes/vortraege/make_inhalt.en.html

Authentic make documentation from GNU: http://www.gnu.org/software/make/manual/make.html

Makefile Syntax:

Makefile has separate lines. Each line ends with a newline (no ; needed unless we put multiple cmds on same line). If we want to continue current line to next line, we can use "\" at end of line (before entering newline). That way "make" sees next line as continuation of previous line. General syntax of Makefile is ( [ ] implies it's optional):

target [ more targets] :[:] [ prerequisites ] [; commands] => If we want to put cmds along with prerequisites, then we need to have ; to separate cmds from prerequisites.

[ <tab> commands ] => Note that there needs to be a literal tab character before commands in every line (spaces instead of a tab do not work and give a "missing separator" error).

[ <tab> commands ]

...

Makefile consists of 5 things:
1. comments: start with #
2. defn of variables/functions: myvar = adb, or myvar := adb (spaces don't matter. i.e myvar=adb is fine too). Now myvar can be accessed anywhere in Makefile by using $(myvar). We should always use $(myvar) instead of $myvar, as myvar is not treated as a single var when not inside (), causing $myvar to be expanded as $m followed by yvar. We can apply var to subset of string. i.e: $(myvar)_module/ABC/X will expand to adb_module/ABC/X. Curly braces {} can be used too. "=" expands var/func at use time, while ":=" expands them at declaration time.

myvar ?= adb => ?= is a conditional assignment and assigns a value to var only if that var hasn't been defined previously.

    ex:  XTERM_TITLE = "$@: `pwd`"

    ex: ARGS = -block $(DES) $(SCR)/tmp/a.tx -name kail


3. Includes: to include other Makefile, since 1 Makefile may get too big. When "-" is placed in front of any cmd, make ignores errors from that cmd and moves on. If - is not placed before a cmd, and that cmd fails, then make aborts. -include Makefile.local will include Makefile.local, and if Makefile.local is missing, then make will keep on moving (w/o aborting)

4. Rules: hello: a; @echo hello

5. other cmds: all unix cmds that can be used in shell, can be used in Makefile. (ex: echo, for ... do .. done, export, etc). These cmds can be put directly on action/cmd line of rules.

ex: @rm -rf $(DES) => removes files. @-rm -rf * => - in front causes the cmd to be ignored if there's any error executing the cmd, and make moves forward with next line.

Rules: We will talk about rules, since they are the heart of Makefile:
ex: below ex defines rules for target hello & diskfree.  Rules have multiple lines. 1st line is rule or dependency (or prerequisite) line, 2nd line is action line. 1st line says that the prerequisites have to be satisfied before 2nd line can be run. make first brings each prerequisite up to date based on its own dependencies (running that prerequisite's rule if needed), and then runs the action line if the target is missing or older than any prerequisite. (1st variable on the rule line (ie "hello" in ex below) is the name of the target that can be specified on unix cmd line as "make hello")
hello: ; => dependency line (or pre-requisite line): It's blank here as we don't have any dependency. We can put a ; at end of line if we want to put next cmd on this line itself, otherwise it's not needed.
       @echo Hello => action line: running "make hello" outputs "Hello" on screen. @ prevents make from announcing the command it's going to do. So, @ prevents the cmd "echo Hello" from getting printed on screen. Since echo is already printing "Hello" on screen, we do not want 2 lines to be printed on screen (i.e "echo Hello" followed by "Hello"). That's why we put a @
diskfree: ;
          df -h => running "make diskfree" runs "df -h" which outputs disk usage on screen. Since @ is not used, it also outputs the cmd "df -h" on screen

#create a Makefile and copy the above 2 lines in it. Then run 2 cmds below on cmd line in shell. If we don't tell make a target, it will simply do the first rule:
make => will do target hello, resulting in "Hello" on screen.
make diskfree hello => it will do these targets in order, first diskfree, then hello.

Make options:

There are various options that can be used when running make. They can be found on gnu link above. Some of the imp ones are below:

#above we did not specify which makefile to use.  It uses Makefile in current dir. To be specific, we say:
make -f makefile.cvc => runs make on this makefile.cvc. no target specified, so does the first rule, which is "cvc" which makes binary for cvc. To run clean, do:
make -f makefile.cvc clean => runs rule "clean" for this Makefile.

make -C dir1/dr2 all => -C specifies the dir where Makefile is. all is a convention. "all" rule is defined which runs sub targets to build the entire project. Since usually we want to run "all", we put it as first target, so that just running "make" will run "make all".

make hello -n => -n option shows all the steps that will be run for target "hello" without actually running the steps. This helps us in understanding the sequence of steps that will be run, when analyzing a Makefile. This is very useful in debug and used a lot. Always run any target with "-n" option to check what it's going to do, and then run it without "-n" to run it.

pattern matching : % can be used to match multiple targets. ‘%’ acts as a wildcard and matches any nonempty sequence of characters within a word. Ex:

final_%_run: %.out ;

           @echo "Hell" => Now, when we run "make final_2ab_run", then this rule gets run, as target matches name in Makefile with % matching "2ab". It has a dep 2ab.out (since wildcard % is assigned 2ab). We cannot use % in action line, as it's not substituted with "2ab" (the automatic variable $* holds the matched stem there). If we run "make final_abc_run", then again the same target gets run, but now % is replaced by abc. So, dep is now abc.out. NOTE: when we use % notation, then "make" w/o any target will error out, as there's no default matching target.
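
How % picks up "2ab" from final_2ab_run and carries it into the dep can be sketched as below. This is an illustrative Python sketch that assumes the pattern has exactly one %, with made-up function names. It mimics the matching idea only and is not make's real implicit rule search.

```python
def match_stem(pattern, target):
    """Return the text matched by % in pattern, or None if target doesn't match."""
    prefix, _, suffix = pattern.partition("%")
    # The stem must be nonempty, so target has to be longer than prefix+suffix.
    if (len(target) > len(prefix) + len(suffix)
            and target.startswith(prefix)
            and target.endswith(suffix)):
        return target[len(prefix):len(target) - len(suffix)]
    return None


def expand_prereq(prereq_pattern, stem):
    """Substitute the matched stem for % in a prerequisite pattern."""
    return prereq_pattern.replace("%", stem, 1)


if __name__ == "__main__":
    stem = match_stem("final_%_run", "final_2ab_run")
    print(stem, expand_prereq("%.out", stem))
```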

Phony targets:

.PHONY: By default, Makefile targets are "file targets" - they are used to build files from other files. Make assumes its target is a file. i.e "hello: abc ;echo .." implies hello and abc are files (hello and abc files are generated via cmds in Makefile). It looks at timestamp of hello and abc files to decide what to do. However, many targets such as "clean", "all", "install" are not files, so if there is a file with same name, make will start looking at timestamp of such files to determine what to do.  The .PHONY directive tells make which target names have nothing to do with potentially existing files of the same name. PHONY implies these targets are not real.  .PHONY target implies target that is always out of date and always runs ignoring any time stamps on file names. ex: .PHONY: setup => this will cause setup to be run irrespective of the state of file "setup" if any.

automatic variables: On top of var defined by user, we also have in built var:

  •  $@ in action line gets substituted with target name
    ex: hello: ;
               @echo printmsg $@ => prints "printmsg hello" on screen.
     
  • $< in action line substitutes it with name of 1st pre-requisite in dep line. To get names of all prereq with spaces in b/w them, use $^ or $+ ($^ removes duplicate prereq, while $+ retains all of them).
    ex: tiger.pdf: tiger.ps; ps2pdf $< => make tiger.pdf will run this cmd: ps2pdf tiger.ps
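
The substitution of $@, $<, $^ and $+ can be mimicked with plain string replacement. A minimal Python sketch for illustration only: expand_recipe is a made-up name, and real make handles many more automatic variables and quoting cases.

```python
def expand_recipe(recipe, target, prereqs):
    """Replace $@, $<, $^ and $+ in an action line the way the notes describe."""
    unique_prereqs = list(dict.fromkeys(prereqs))  # keeps order, drops repeats
    values = {
        "$@": target,                            # target name
        "$<": prereqs[0] if prereqs else "",     # 1st prerequisite
        "$^": " ".join(unique_prereqs),          # all prereqs, duplicates removed
        "$+": " ".join(prereqs),                 # all prereqs, duplicates kept
    }
    for name, value in values.items():
        recipe = recipe.replace(name, value)
    return recipe


if __name__ == "__main__":
    print(expand_recipe("ps2pdf $<", "tiger.pdf", ["tiger.ps"]))
    print(expand_recipe("cc -o $@ $^", "sum", ["main.o", "sum.o", "main.o"]))
```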

 



------
1. Example of Makefile and make:

ex: executable "sum" is to be generated from 2 C files (main.c,sum.c) and 1 h file sum.h, which is included in both c files.
run make => reads Makefile, creates dependency tree and takes necessary action.


#binary exe
sum: main.o sum.o => dependency line states that exe sum depends on main.o and sum.o
      cc -o sum main.o sum.o => cmd/action line states how pgm should be compiled if 1 or more .o files have changed.

#main dep
main.o: main.c sum.h => dep line for main.o. we can omit main.c, since built in rule for make says that .o file depends on corresponding .c file.
        cc -c main.c => cmd line stating how to generate main.o

#sum dep
sum.o: sum.c sum.h => similarly we can omit sum.c from here
       cc -c sum.c

#above 2 dep, main & sum can be combined into 1:
main.o sum.o: sum.h     => means both .o files depend on sum.h (dep on main.c and sum.c is implied automatically)
              cc -c $*.c => macro $*.c expands to main.c for main.o and sum.c for sum.o



2. Example of Makefile and make:

ex: executable for arm processor:

Makefile: /data/tmp/Makefile
run:  make TGT=hellow

#define variables for compiler, assembler, linker and elf
tool-rev        =       -4.0-821
CC              =       armcc $(tool-rev)

#compiler options
CCFLAGS         =       $(CPUTARGET) -I $(INCPATH) -c --data_reorder \
                        --diag_suppress=2874 \
                        --asm

#dependency rule to state
IKDEPS_MAIN     =       CMSIS/Core/CM0/core_cm0.h CMSIS/Core/CM0/core_cm0.c cm0ikmcu.h IKtests.h IKtests.c IKConfig.h debug_i2c.h sporsho_tb.h Makefile
IKDEPS          =       $(IKDEPS_MAIN) debugdriver
IKOBJS          =       boot.o system_cm0ikmcu.o retarget_cm0ikmcu.o IKtests.o debug_i2c.o sporsho_tb.o

# Performance options (adds more options to compiler flag)
CCFLAGS         +=      -O2 -Otime -Ono_autoinline -Ono_inline

#Rules to create dependency tree
#top level target depends on TGT.bin
$(TGT):         $(TGT).bin => TGT depends on TGT.bin
                @echo => the '@' prefix tells make not to echo the cmd itself before running it (here the cmd is a bare echo, which just prints an empty line).
 
#expands to fromelf -4.0-821 --bin -o kail_rtsc.bin kail_rtsc.elf
$(TGT).bin:     $(TGT).elf
                $(FROMELF) --bin -o $@ $<

#expands to armlink -4.0-821 --map --ro-base=0x0 --rw-base=0x20000020 --symbols --first='boot.o(vectors)' --datacompressor=off --info=inline -o kail_rtsc.elf kail_rtsc.o boot.o system_cm0ikmcu.o retarget_cm0ikmcu.o IKtests.o debug_i2c.o sporsho_tb.o
$(TGT).elf:     $(TGT).o $(IKOBJS)
                $(LD) $(LDFLAGS) -o $@ $(TGT).o $(IKOBJS)

#expands to armcc -4.0-821 --cpu=Cortex-M0 -I ./CMSIS/Core/CM0 -c --data_reorder --diag_suppress=2874 --asm  -O2 -Otime -Ono_autoinline -Ono_inline -o kail_rtsc.o kail_rtsc.c
$(TGT).o:       $(TGT).c $(IKDEPS)
                $(CC) $(CCFLAGS) -o $@ $< => $@ = TGT.o, $< = TGT.c

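#all is the conventional name for the default target. NOTE: a plain "make" runs the first target in the Makefile (unless .DEFAULT_GOAL is set), so "all" has to be listed first to be run when no target is specified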
all:    debug

#similarly we specify rules for debug_i2c.o, sporsho_tb.o, boot.o, boot_evm.o, sporsho1_lib.o, retarget_cm0ikmcu.o, system_cm0ikmcu.o, system_cm0ikmcu_evm.o, IKtests.o.
#ex for IKtests. expands to armcc -4.0-821 --cpu=Cortex-M0 -I ./CMSIS/Core/CM0 -c --data_reorder --diag_suppress=2874 --asm -O2 -Otime -Ono_autoinline -Ono_inline -o IKtests.o IKtests.c
IKtests.o:      IKtests.c $(IKDEPS_MAIN)
                $(CC) $(CCFLAGS) -o $@ $<

#clean removes all generated files when we run "make clean"
clean:
                rm -f *.bin *.elf *.o *.s *~
-----------------------------------

Advanced section:

1. Function for string substitution:

A. subst => simple substitution => $(subst from,to,text) => Performs a textual replacement on the text text: each occurrence of from is replaced by to. The result is substituted for the function call.

ex: Var1 = $(subst ee,EE,feet on the street) => new string "fEEt on the strEEt" assigned to Var1

B. patsubst => pattern substitution => $(patsubst pattern,replacement,text) => Finds whitespace-separated words in text that match pattern and replaces them with replacement. Here pattern may contain a ‘%’ which acts as a wildcard, matching any number of any characters within a word. If replacement also contains a ‘%’, the ‘%’ is replaced by the text that matched the ‘%’ in pattern. 

ex: Var2 = $(patsubst %.c,%.o,x.c.c bar.c) => .c replaced with .o, everything else is copied exactly as it's % on both pattern and replacement. so, final value assigned to Var2 is "x.c.o bar.o"
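
To make the matching rules of subst and patsubst concrete, here is a minimal Python sketch of both. The Python function names simply mirror the make functions; the model treats only the first % as a wildcard and does not handle backslash-escaped % characters, so it is an approximation of GNU make's behaviour rather than a reimplementation.

```python
def subst(from_text, to_text, text):
    """Model of $(subst from,to,text): plain textual replacement."""
    return text.replace(from_text, to_text)


def patsubst(pattern, replacement, text):
    """Model of $(patsubst pattern,replacement,text), word by word."""
    has_wildcard = "%" in pattern
    if has_wildcard:
        prefix, suffix = pattern.split("%", 1)
    result = []
    for word in text.split():
        if not has_wildcard:
            # Without %, the whole word must match the pattern exactly.
            result.append(replacement if word == pattern else word)
        elif (len(word) >= len(prefix) + len(suffix)
              and word.startswith(prefix) and word.endswith(suffix)):
            stem = word[len(prefix):len(word) - len(suffix)]  # text matched by %
            result.append(replacement.replace("%", stem, 1))
        else:
            result.append(word)  # non-matching words are copied unchanged
    return " ".join(result)


print(subst("ee", "EE", "feet on the street"))
print(patsubst("%.c", "%.o", "x.c.c bar.c"))
print(patsubst("%.c", "%.o", "foo.h bar.c"))
```

Note that subst works on the raw text (it does not care about word boundaries), while patsubst only rewrites whole words that match the pattern.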

2. substitution reference: A substitution reference substitutes the value of a variable with alterations that you specify. It's identical to patsubst function above, so there's really no need for this, but it's provided for compatibility with some implementations of make.

Form => $(var:a=b) or ${var:a=b} => () or {} mean the same thing. Takes the value of the variable var, replaces every "a" at the end of a word with "b" in that value, and substitutes the resulting string. Only those "a" that are at the end of a word (i.e followed by whitespace or the end of the value) are replaced. All other "a" remain unaltered. This form is the same as patsubst => $(patsubst %a,%b,$(var))

foo := a.o b.o c.o
bar := $(foo:.o=.c) => It says to look at all words of var "foo", and replace ".o" with ".c" in every word that ends in ".o". Then assign this modified string to var "bar". So, bar gets set to "a.c b.c c.c". Here the wildcard char % of patsubst is not written explicitly.
bar := $(foo:%.o=%.c) => Here wildcard char % is used for matching. So, bar gets set to "a.c b.c c.c". Here wild card matching of patsubst is used.
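
The two forms of substitution reference can be modeled with one small Python function. The name substitution_ref is hypothetical; the sketch assumes the $(var:a=b) form behaves like $(patsubst %a,%b,$(var)) and, as a simplification, treats only the first % as a wildcard.

```python
def substitution_ref(value, a, b):
    """Model of $(var:a=b) applied to the string `value`."""
    if "%" in a:
        prefix, suffix = a.split("%", 1)   # $(var:%.o=%.c) form
    else:
        prefix, suffix = "", a             # $(var:.o=.c) form: suffix match only
        b = "%" + b                        # i.e. same as pattern %a -> %b
    result = []
    for word in value.split():
        if (len(word) >= len(prefix) + len(suffix)
                and word.startswith(prefix) and word.endswith(suffix)):
            stem = word[len(prefix):len(word) - len(suffix)]
            result.append(b.replace("%", stem, 1))
        else:
            result.append(word)            # words that do not match are kept
    return " ".join(result)


foo = "a.o b.o c.o"
print(substitution_ref(foo, ".o", ".c"))
print(substitution_ref(foo, "%.o", "%.c"))
print(substitution_ref("abc", "%", "%.target1_me"))
```

The last call shows the "%=%.suffix" idiom: a bare % matches the whole word, so the replacement just appends a suffix to every word in the variable.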

ex: following is in a makefile to create different target based on what TOP_MOD is being set to. TOP_MOD is assigned value from cmd line or from some other file: TOP_MOD := abc

target1_me: ${TOP_MOD:%=%.target1_me} ; => whenever we run target1_me from the make cmdline, it calls this target. Its dependency is specified within ${..}. Since this is a substitution reference, the whole ${ .. } expands to abc.target1_me. So, make looks for target abc.target1_me.
%.target1_me: ; echo "Hell"; => make finds this target as % matches abc, so it starts running this target with whatever action it's supposed to do. In effect, we redirected flow to the "abc.target1_me" target. This is helpful in cases where the same target needs to be run multiple times, but with different options.

ex:

Difference in DC (Design Compiler) vs EDI (Encounter Digital Implementation):
-----------------------
1. Many of the cmds work on both DC and EDI. The biggest difference is in the way they show o/p. For all the cmds below, if we use the tcl "set" command to set a variable to the o/p of any of these cmds, then in DC it contains the actual object, while in EDI it contains a pointer and not the actual object. We have to do a query_objects in EDI to print the object. DC prints the object by using list.
2. Unix cmds don't work directly in EDI, while they do in DC. So, for EDI, we need to have the "exec" tcl cmd before the linux cmd, so that it's interpreted by the tcl interpreter within EDI.
3. Many new tcl cmds like "lassign", etc don't work in EDI.
4. NOTE: a script written for EDI will always work for DC as it's written as pure tcl cmds.

Design compiler:
---------------------
Register inference: (https://solvnet.synopsys.com/dow_retrieve/F-2011.06/dcrmo/dcrmo_8.html?otSearchResultSrc=advSearch&otSearchResultNumber=2&otPageNum=1#CIHHGGGG)
-------
On doing elaborate on an RTL, HDL compiler (PRESTO HDLC for DC) reads in a Verilog or VHDL RTL description of the design, and translates the design into a technology-independent representation (GTECH). During this, all "always @" stmts are looked at for each module. Mem devices are inferred for flops/latches and "case" stmts are analyzed. After that, the top level module is linked, all multiple instances are uniquified (so that each instance has a unique module defn), and clk-gating/scan and other user supplied directives are looked at. Then pass 1 mapping and then opt are done. Unused regs, unused ports and unused modules are removed.

#logic level opt: works on opt GTECH netlist. consists of 2 processes:
A. Structuring: subfunctions that can be factored out are optimized. Also, intermediate logic structure and variables are added to the design.
B. Flattening: comb logic paths are converted to 2 level SOP, and all intermediate logic structure and variables are removed.

This generic netlist has the following cells:
1. SEQGEN cells for all flops/latches (i/p=clear, preset, clocked_on, data_in, enable, synch_clear, synch_preset, synch_toggle, synch_enable, o/p= next_state, Q)
2A. ADD_UNS_OP for all unsigned adders/counters comb logic (i/p=A,B, o/p=Z). These can be any bit adders/counters. DC breaks large bit adders/counters into small bit ones (i.e an 8 bit counter may be broken into 2 counters: 6 bit and 2 bit). Note that flops are still implemented as SEQGEN. Only the combinatorial logic of this counter/adder (i.e a+b or a+1) is impl as ADD_UNS_OP, the o/p of which feeds into flops.
2B. MULT_UNS_OP for unsigned multiplier/adder?
2C. EQ_UNS_OP for checking unsigned equality b/w two sets of bits, GEQ_UNS_OP for greater than or equal (i/p=A,B, o/p=Z). i/p may be any no. of bits but o/p is 1 bit.
3. SELECT_OP for Muxes (i/p=data1, data2, ..., datax, control1, control2, ..., controlx, o/p=Z). May be any no. of i/p,o/p.
4. GTECH_NOT(A,Z), GTECH_BUF, GTECH_TBUF, GTECH_AND2/3/4/5/8(A,B,C,..,Z), GTECH_NAND2/3/4/5/8, GTECH_OR2/3/4/5/8, GTECH_NOR2/3/4/5/8, GTECH_XOR2/3/4, GTECH_XNOR2/3/4, GTECH_MUX*, GTECH_OAI/AOI/OA/AO, GTECH_ADD_AB(Half adder: A,B,S,COUT), GTECH_ADD_ABC(Full adder: A,B,C,S,COUT), GTECH_FD*(D FF with clr/set/scan), GTECH_FJK*(JK FF with clr/set/scan), GTECH_LD*(D Latch with clr), GTECH_LSR0(SR latch), GTECH_ISO*(isolation cells), GTECH_ONE/ZERO, for various cells. DesignWare IP (from synopsys) uses these cells in its implementation. NOTE: in a DC gtech netlist, we commonly see GTECH gates as NOT, BUF, AND, OR, etc. Flops, latches, adders, mux, etc are rep as cells shown in bullets 1-4 above.
5. All directly instantiated lib components in RTL.
6. If we have a designware license, then we also see designware elements in the netlist. All designware are rep as DW*. For ex: DW adder is DW01_add (n bit width, where n can be passed as defparam or #). Maybe *_UNS_OP above are designware elements.

#gate level opt: works on the generic netlist created by logic level opt to produce a technology-specific netlist. consists of 4 processes:
A. Mapping: maps gates from tech lib to gtech netlist. Tries to meet timing/area goal.
B. Delay opt: fixes delay violations introduced during mapping. Does not fix design rule or opt rule violations.
C. Design rule fixing: fixes design rule violations by inserting buffers or resizing cells. If necessary, it can violate opt rules.
D. Opt rule fixing: fixes opt rule violations, once the above 3 phases are completed. However, it won't fix these if doing so introduces delay or design rule violations.

-------
In GTECH, both registers and latches are represented by a SEQGEN cell, which is a technology-independent model of a sequential element as shown in Figure 8-1. SEQGEN cells have all the possible control and data pins that can be present on a sequential element. A FlipFlop or latch is inferred based on which pins are actually present in the SEQGEN cell. Register is a latch or FF. D-Latch is inferred when the resulting value of the o/p is not specified under all conditions (as in an incompletely specified IF or CASE stmt). SR latches and master-slave latches can also be inferred. D-FF is inferred whenever the sensitivity list of an always block or process includes an edge expression (rising/falling edge of signal). JK FF and Toggle FF can also be inferred.
# "_reg" is added to the name of the reg from which the ff/latch is inferred. (i.e count <= .. implies count_reg as name of the flop/latch)

o/p: Q and QN (for both flop and latch)
i/p:
1. Flop: clear(asynch_reset), preset(async_preset), next_state(sync data Din), clocked_on(clk), data_in(1'b0), enable(1'b0 or en), synch_clear(1'b0 or sync reset), synch_preset(1'b0 or sync preset), synch_toggle(1'b0 or sync toggle), synch_enable(1'b1)
2. Latch: clear(asynch_reset), preset(async_preset), next_state(1'b0), clocked_on(1'b0), data_in(async_data Din), enable(clk), synch_clear(1'b0), synch_preset(1'b0), synch_toggle(1'b0), synch_enable(1'b0)

Ex: Flop in RTL:
  always @(posedge clkosc or negedge nreset)
    if (~nreset) Out1 <= 'b0;
    else Out1 <= Din1;
Flop replaced with SEQGEN in DC netlist: clear is tied to net 0, which is N35. preset=0, since no async preset. data_in=0 since it's not a latch. sync_clear/sync_preset/sync_toggle also 0. synch_enable=1 means it's a flop, so enable, if used, is sync with clock. enable=0 as no enable in this logic.
  \**SEQGEN** Out1_reg ( .clear(N35), .preset(1'b0), .next_state(Din1), .clocked_on(clkosc), .data_in(1'b0), .enable(1'b0), .Q(Out1), .synch_clear(1'b0), .synch_preset(1'b0), .synch_toggle(1'b0), .synch_enable(1'b1) );

Ex: Latch in RTL:
  always @(*)
    if (~nreset) Out1 <= 'b0;
    else if (clk) Out1 <= Din1;
Latch replaced with SEQGEN in DC netlist: all sync_* signals set to 0 since it's a latch. synch_enable=0 as enable is not sync with clk in a latch. enable=clk since it's a latch.
  \**SEQGEN** Out1_reg ( .clear(N139), .preset(1'b0), .next_state(1'b0), .clocked_on(1'b0), .data_in(Din1), .enable(clk), .Q(Out1), .synch_clear(1'b0), .synch_preset(1'b0), .synch_toggle(1'b0), .synch_enable(1'b0) );

NOTE: flop has both enable and clk ports separate. sync_enable is set to 1 for flop (and 0 for latch). That means, lib cells can have Enable and clk integrated into the flop. If we have RTL as shown below, it will generate a warning if there is no flop with integrated enable in the lib.
ex: always @(posedge clk) if (en) Y <= A; //This is a flop with enable signal.
warning by DC: The register 'Y_reg' may not be optimally implemented because of a lack of compatible components with correct clock/enable phase. (OPT-1205). => this will be implemented with Mux and flop as there's no "integrated enable flop" in library.

#Set the following variable in HDL Compiler to generate additional information on inferred registers:
  set hdlin_report_inferred_modules verbose

Example 8-1 Inference Report for D FF with sync preset control (for a latch, type changes to latch)
=========================================================================
| Register Name |   Type    | Width | Bus | MB | AR | AS | SR | SS | ST |
=========================================================================
|     Q_reg     | Flip-flop |   1   |  N  | N  | N  | N  | N  | Y  | N  |
=========================================================================
Sequential Cell (Q_reg)
  Cell Type: Flip-Flop
  Width: 1
  Bus: N (since just 1 bit)
  Multibit Attribute: N (if it is multi bit ff, i.e each Q_reg[x] is a multi bit reg. in that case, this ff would get mapped to cell in .lib which has ff_bank group)
  Clock: CLK (shows name of clk. For -ve edge flop, CLK' is shown as clock)
  Async Clear(AR): 0
  Async Set(AS): 0
  Async Load: 0
  Sync Clear(SR): 0
  Sync Set(SS): SET (shows name of Sync Set signal)
  Sync Toggle(ST): 0
  Sync Load: 1

#Flops can have sync reset (there's no concept of sync reset for latches). Design Compiler does not infer synchronous resets for flops by default. It sees the sync reset signal as combo logic, and builds combo logic (with an AND gate at the i/p of the flop) to implement it. To indicate to the tool that we should use an existing flop (with sync reset), use the sync_set_reset Synopsys compiler directive in Verilog/VHDL source files. HDL Compiler then connects these signals to the synch_clear and synch_preset pins on the SEQGEN in order to communicate to the mapper that these are the synchronous control signals and they should be kept as close to the register as possible. If the library has a reg with sync set/reset, then these are mapped, else the tool adds extra logic on the D i/p pin (adds AND gate) to mimic this behaviour.
ex: //synopsys sync_set_reset "SET" => this is put in RTL inside the module for DFF. This says that pin SET is the sync set pin, and a SEQGEN cell with clr/set should be used.

#Latches and Flops can have async reset. DC is able to infer async reset for a flop (by choosing a SEQGEN cell with async clear and preset connected appr), but for latches, it's not able to do it (it chooses a SEQGEN cell with async clear/preset tied to 0). This is because it sees the clear/preset signal as any other combo signal, and builds combo logic to support it. DC maps the SEQGEN cell (with clr/preset tied to 0) to a normal latch (with no clr/set) in the library, and then adds extra logic to implement async set/reset. It actually adds an AND gate to D with the other pin connected to clr/set, and an inverter on the clr/set pin followed by an OR gate (with the other pin of the OR gate tied to clk). So, basically we lose the advantage of having an async latch in .lib. To indicate to the tool that we should use an existing latch (with async reset), use the async_set_reset Synopsys compiler directive in Verilog/VHDL source files.
ex: //synopsys async_set_reset "SET" => this says pin SET is the async set/reset pin, and a SEQGEN cell with clr/set should be used.

#infer_multi_bit pragma => maps registers, multiplexers and 3 state drivers to multibit library cells.

#stats for case stmt: shows full/parallel for case stmt. auto means it's full/parallel.
A. full case: all possible branches of the case stmt are specified. Otherwise a latch is synthesized. Non-full cases happen for state machines when the number of states is not a power of 2. In such cases, unused states are opt as don't care.
B. parallel case: only one branch of the case stmt is active at a time (i.e case items do not overlap). Non-parallel cases may happen when the case stmt has "x" in the selection, or multiple select signals are active at the same time (case (1'b1) sel_a: out=1; sel_b: out=0;). If more than 1 branch can be active, then priority logic is built (sel_a given priority over sel_b), else a simple mux is synthesized. RTL sim may differ from gate sim for a non-parallel case.

#The report_design command lists the current default register type specifications (if we used the "set_register_type" directive to set flipflop/latch to something from the library).
  dc_shell> report_design
  ...
  Flip-Flop Types: Default: FFX, FFXHP, FFXLP

#MUX_OPs: listed in report_design. MUX_OPs are multiplexers with built in decoders. Faster than SELECT_OPs as SELECT_OPs have decoding logic outside.
ex: reg [7:0] flipper_ram[255:0]; => 8 bit array of ram from 0 to 255
    assign p1_rd_data_out = flipper_ram[p1_addr_in]; => rd 8 bits out from addr[7:0] of ram. equiv to rd_data[7:0] = ram[addr[7:0]].
This gives the following statistics for MUX_OPs generated from the previous stmt. (MUX_OPs are used to implement indexing into a data variable, using a variable address)
===========================================================
| block name/line | Inputs | Outputs | # sel inputs | MB |
===========================================================
| flipper_ram/32  |  256   |    8    |      8       | N  |
===========================================================
=> 8 bit o/p (rd_data), 8 bit select (addr[7:0]), 256 i/p (i/p refers to distinct i/p terms that the mux is going to choose from, so here there are 256 terms to choose from; no. of bits for each term is already indicated in o/p (8 bit o/p))

#list_designs: list the names of the designs loaded in memory, all modules are listed here.
#list_designs -show_file : shows the path of all the designs (*.db in main dir)

--------------------------
Optimization priority in DC
--------------------------
Uses cost types to optimize the design. Cost types are design rule cost and optimization cost. By default, highest priority goes to design rule cost (top one) and then priority goes down as we move to the bottom ones.
1. design rule cost => constraints are DRC (max_fanout, max_trans, max_cap, connection class, multiple port nets, cell degradation)
2. opt cost:
   A. delay cost => constraints are clk period, max_delay, min_delay
   B. dynamic power cost => constraints are max dynamic power
   C. leakage power cost => constraints are max lkg power
   D. area cost => constraints are max area

------------------------
#terminology within Synopsys. https://solvnet.synopsys.com/dow_retrieve/F-2011.06/dcug/dcug_5.html
#designs => ckt desc using verilog HDL or VHDL. Can be at logic level or gate level. Can be flat designs or hier designs. It consists of instances (or cells), nets (connect ports to pins and pins to pins), ports (i/o of design) and pins (i/o of cells within a design). It can contain subdesigns and library cells. A reference is a library component or design that can be used as an element in building a larger circuit. A design can contain multiple occurrences of a reference; each occurrence is an instance. The active design (the design being worked on) is called the current design. Most commands are specific to the current design.

#to list the names of the designs loaded in memory
  dc_shell> list_designs
  a2d_ctrl digtop (*) spi etc => * shows that digtop is the current design
  dc_shell> list_designs -show_file => shows memory file name corresponding to each design name
  /db/Hawkeye/design1p0/HDL/Synthesis/digtop/digtop.db
  digtop (*)
  /db/Hawkeye/design1p0/HDL/Synthesis/digtop/clk_rst_gen.db
  clk_rst_gen

#The create_design command creates a new design.
  dc_shell> create_design my_design => creates a new design, but it contains no design objects. Use the appropriate create commands (such as create_clock, create_cell, or create_port) to add design objects to the new design.

Synopsys/Standard design constraints (SDC)

SDC is a subset of the design constraint commands already supported by many CAD tools. SDC was agreed on as a standard, since diff tool vendors had their own synthesis/timing constraint cmds, which made it difficult to port these constraints. Since most of the constraints for synthesis, timing, etc are standard (i.e define clock, port delays, false paths, etc), it just made sense to have standard constraints that would be supported by all vendors. Synopsys and Cadence Synthesis, timing, PnR, etc tools support these SDC cmds.
SDC versions are 1.2, 1.3 .. 2.0. In write_sdc (in both synopsys and cadence tools), we can specify the version of the sdc file to write (default is to use the latest version). SDC cmds started out as design constraint cmds only, but over time expanded to include cmds pertaining to reporting, collection, get objects, etc.

As of 2022, still no one, including me, knows the full form of SDC !! Synopsys, which had these cmds initially in their synthesis/timing tools, allowed them to become a standard. Cadence and other companies grudgingly accepted it, and called it Standard design constraints (but don't mention the full form anywhere). But the Synopsys website still refers to these as "Synopsys Design constraints".

Before we learn these cmds, let's go over few basics of design that these cmds work on.

Objects:

Objects are anything in the design like ports, cells, nets, pins, etc. Valid objects are design, port, cell, pin, net, lib, lib_cell, lib_pin, and clock. Each of these objects may have multiple attributes. As an ex, each gate may have 100's of attributes such as gate_name, size, pins, delay_arcs, lib_name, etc. These objects are further put into several classes such as ports, cells, nets, pins, clocks, etc. Most commands operate on these objects.

Collections:

Synopsys applications build an internal database of objects and attributes applied to them. Cadence applications also build a similar internal database. but the internal representation may be different, and process to access them may be different. So arguments of some of the sdc commands may be different across different vendors, even though they may support the basic sdc cmd. Same sdc file in Synopsys may not be used directly in cadence tools, as many of these collection cmds (cmds that work on collection of objects) may have some part of code (cmds that were used to make a collection) that may not be recognized by Cadence tools as valid. The situation is improving, but this is something that needs to be kept in mind. Always read the SDC manual of a vendor to find out what sdc cmds and syntax it supports.

Definition: A collection is a group of objects exported to the Tcl user interface. Collections are a tcl extension provided by EDA vendors (Synopsys/Cadence) to support lists of objects in their Tcl API. Most of the design cmds work on these collections of objects. In the Tcl language, we have 2 composite data types, "list" and "array", that allow us to put multiple elements into a single group. Collections can be thought of as similar to a Tcl list, though they are not interchangeable, as the internals of the 2 are different. However, many cmds take both list and collection as i/p, not differentiating between the 2. This is done for user convenience. Internally the list is converted to a collection, and the o/p of the cmd is always given out as a collection. Visually collections look like lists (when displayed on the screen), but they are NOT lists.

A set of commands to create and manipulate collections is provided as an integral part of the user interface. The collection commands encompass two categories: those that create collections of objects for use by another command, and others that query objects for viewing. These two types of collection cmds are:

1. create/manipulate objects: add_to_collection, append_to_collection, remove_from_collection, sizeof_collection, foreach_in_collection, sort_collection, compare_collections, copy_collection, filter_collection, etc are a few common collection cmds. These cmds work on collections of objects to manipulate that object list and create a new collection of objects. As such, these cmds may be used as i/p to other cmds that expect a collection of objects as their i/p. A collection cmd returns a pointer to that collection (i.e 0x78 is what you see on the screen when creating a collection), which is used by other cmds. In most of the collection cmds below (append_to_collection being the exception), the collections provided to the cmd themselves remain unchanged: a new collection is created which can either be passed to other cmds, or may be assigned to a var, which becomes the new collection.

  1. create/remove collection: Collections can be created by starting with an empty collection, and using "add_to_collection" or "append_to_collection" (both being similar, although append is more efficient in some situations as per PT manual). Adding an implicit list of only strings or heterogeneous collections to the empty collection generates an error message, because no homogeneous collections (collection of same class, i.e either all ports, or all cells, etc) are present in the object_spec list. As long as one homogeneous collection is present in the object_spec list, the command succeeds, even though a warning message is generated. If the base collection to which we are adding the new collection is not empty, then heterogeneous objects (i.e ports, cells) from the second collection are added to the base collection only if the base collection also has heterogeneous objects. So, the rules get complicated, and it is hard to tell what got added/deleted. Instead of relying on the add or append cmd, it's easier to just make a collection using get_ports, get_cells, etc, which implicitly make a collection of items.
    1. add_to_collection: Creates a new collection by concatenating objects from the base collection and the second collection. base_collection needs to be a collection, while second collection may be collection or object list. The base collection and second collection remain unmodified, i.e these collections are appended and new collection is made which can either be passed to other cmds, or may be assigned to a var, which becomes the new collection. This cmd returns AUB (i.e union of A and B)
      • Syntax: add_to_collection <base_coll> <second_coll> => -unique option removes the duplicate objects from the new collection. 
        • ex in PT: pt_shell> add_to_collection "" [list port1 port2] => This adds design objects "port1" and "port2" to the empty collection and creates a new collection which has no name, and if not passed to a cmd, does nothing. We can create a new coll, by giving it a name as => set var1 [add_to_collection "" [list port1 port2]]
        • pt_shell> set port_coll [list port1 port2] => Here new collection port_coll  is formed which has the 2 ports in that coll. These ports may or may not be valid. Here we didn't have to explicitly create a collection as the o/p of [...] is converted to a  collection, the pointer to which is set to port_coll. echo $port_coll returns a pointer as 0x87 (in Cadence tools) or _sel3555 (in Synopsys tools). However on screen it shows {"port1", "port2"}. We could also use get_ports cmd whose o/p is a port list collection. That's a preferred way as that guarantees correct collection being provided as i/p to collection list. i.e
        • pt_shell> set port_coll [get_ports [list port_a {port_b[*]}]] => Braces keep Tcl from treating [*] as a cmd substitution. Here get_ports o/p is a coll of => {"port_a", "port_b[0]", "port_b[1]"} (assuming port_b is a 2 bit bus). We set this coll to port_coll
        • pt_shell> set port_coll [list $port1 $cell2] => Here port1 and cell2 need to hold valid object names or collections. This may give an error when the result is passed on to collection cmds, as variables $port1 and $cell2 might be pointers to collections, lists, etc. One way to guarantee valid contents is to set these vars from the o/p of get_ports, get_cells or similar cmds which return a collection as o/p. If not, we get an Error in PT => At least one collection required for argument 'object_spec' to add_to_collection when the 'collection' argument is empty (SEL-014)
    2. append_to_collection: Since add_to_collection doesn't modify the original collection, this cmd allows you to append the original coll, which is useful in many cases. This command can be much more efficient than the add_to_collection command if you are building up a collection in a loop.
      • Syntax: append_to_collection var_name <obj_or_coll> => -unique option removes the duplicate objects from the new collection. NOTE: var may or may not be defined, and may be an existing coll or an empty coll. If the var does exist and it does not contain a collection, it is an error. We do NOT have $ in front of the var name, as we are modifying or defining this var (we are not accessing the value of the var). This var becomes a coll once a coll is appended to it.
        • ex: append_to_collection my_ports [get_ports in*] => Here my_ports is a var, to which [get_ports in*] coll get added.
    3. remove_from_collection: To remove elements from collection.  Any element in 2nd coll is removed from base_coll. Similar to "add_to_collection", base collection and second collection remain unmodified. A new collection is created, which can be assigned to a var. Here, collections can't contain any list of objects, which is how add_to_collection behaved (it allowed lists as option)
      • Syntax: remove_from_collection <base_coll> <obj_coll> => o/p is A - (A∩B) since objects in <obj_coll> are removed from <base_coll>. -intersect option does the opposite => it removes obj in base coll that are not found in <obj_coll>. So, with this option, instead of providing o/p as A - (A∩B), it's A∩B (that's why it's called intersect). So, with remove and add cmd, we can find out all combo => AUB, A∩B, A - (A∩B) and B - (A∩B)
        • ex: remove_from_collection [concat [get_cells *] [get_ports *in*]] [get_cells I_*] => removes specified cells from coll of cells and ports
        • ex: set dports  [remove_from_collection [all_inputs] CLK] => all_inputs creates a collection of all i/p ports from which CLK is removed. The remaining collection pointer is assigned to var dports.
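
The set algebra behind add_to_collection and remove_from_collection (AUB, A∩B, A - (A∩B), B - (A∩B)) can be sketched in Python. This is only a model: collections are represented as plain Python lists of object names, whereas the real tools return opaque collection handles and also check object classes. The function names mirror the Tcl cmds, and the keyword arguments stand in for the -unique and -intersect options.

```python
def add_to_collection(base, second, unique=False):
    """Return a new collection: objects of base followed by objects of second."""
    result = list(base) + list(second)
    if unique:
        result = list(dict.fromkeys(result))  # drop duplicates, keep order
    return result


def remove_from_collection(base, objs, intersect=False):
    """Return a new collection with objs removed from base (or kept, if intersect)."""
    objs = set(objs)
    if intersect:
        return [o for o in base if o in objs]   # A intersect B
    return [o for o in base if o not in objs]   # A - (A intersect B)


A = ["in1", "in2", "CLK"]
B = ["CLK", "out1"]

union = add_to_collection(A, B, unique=True)
common = remove_from_collection(A, B, intersect=True)
only_a = remove_from_collection(A, B)
only_b = remove_from_collection(B, A)

print(union, common, only_a, only_b)
print(A, B)  # the i/p collections are left unmodified
```

As in the Tcl cmds, neither function modifies its i/p; each call builds a new list, which is why the results have to be assigned to a var to be kept.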
  2.  compare_collections: To compare 2 coll if they have same elements or not, we use this cmd. 0 indicates success (i.e same elements), while any other number indicates failure (diff elements). This behaviour is consistent with "string compare" in tcl, which returns 0 on match.
    • Syntax: compare_collections <coll1> <coll2> => With -order_dependent option, the collections are considered to be different if the objects are ordered differently.
      • compare_collections $c1 [get_ports out*] => Here c1 coll and ports coll are compared, and if they have same contents (order doesn't matter), then 0 is returned.
  3. foreach_in_collection: To iterate over the objects in a collection, use the foreach_in_collection command. You cannot use the Tcl-supplied foreach iterator to iterate over the objects in a collection, because the foreach command requires a list, and a collection is not a list. The o/p of any cmd that returns a collection is actually a pointer to that collection. The arguments of the foreach_in_collection command are similar to those of foreach: an iterator variable, the collection over which to iterate, and the script to apply at each iteration.
    • Ex: set A [get_ports] => returns these {"port[0]", "IN2", "SPI_CLK0"} => Object names are displayed just for readability. A still holds the pointer value. echo $A => returns a pointer _sel3555
    • foreach_in_collection tmp $A {echo [get_object_name $tmp]} => returns these  3 ports = port[0] IN2 SPI_CLK0. echo $tmp will return pointer value _sel3555 3 times. This is because tmp is just assigned the pointer value
    • foreach_in_collection tmp _sel3555 {echo [get_object_name $tmp]} => we get same result as above, since A is just storing the pointer to collection
    • create_fillers -lib_cells $A => These collections ($A) work just fine in cmds which expect to get a collection pointer as argument. No need to iterate thru each element of collection. Many cmds accept both collection as well as list for argument. So, no issues with either one.
    • set_false_path -to [get_pins mod1/*] => Here get_pins returns a pointer which points to a collection with all pins of mod1. Since set_false_path takes objects in its args, it takes this collection of pins as an argument, and sets false path to all the pins, i.e set_false_path -to {mod1/pinA, "mod1/pinB[0]", ...}. However, if we want, we can iterate thru each pin of "get_pins" cmd using foreach_in_collection loop.
      • ex: foreach_in_collection tmp [get_pins mod1/*] {set_false_path -to [get_object_name $tmp]} => Here each object in collection is assigned to var "tmp" one by one, and then false path set to each of them separately. i.e set_false_path -to mod1/pinA, set_false_path -to "mod1/pinB[0]", ... => see how set_false_path is done separately for each object of collection. So, collections can be used as one, or can be divided into individual elements.
    • get_ports _sel3555 => since _sel3555 is a pointer to collection, cmd get_ports gets collection of ports from the collection pointed by _sel3555. That returns collection of ports similar to what "get_ports" returns by itself. This is just a convoluted way to show that it works.
  4. sizeof_collection: Very useful cmd to figure out the size of collection w/o iterating thru the whole list. To find the size of collection, one way is to use foreach_in_collection cmd above. We increment a counter inside the loop, and when that loop exits, the counter shows the number of elements in the collection. sizeof_collection provides an easier way to do that by using this single cmd instead of a loop. Sometimes collections will be empty, and in such cases, we want to know beforehand. Otherwise our collection cmds will error out since all these cmds expect valid collection pointer. In such cases, sizeof_collection is useful too.
    • ex: if { [sizeof_collection [get_ports -quiet SPI*]] != 0} { foreach_in_collection tmp [get_ports -quiet SPI*] { do operation } } else { echo "collection empty" } => If we directly used foreach_in_collection on an empty list, then the tool would report an error saying the collection is empty. We avoid that by using if else.
  5. filter_collection: Most used cmd to filter objects from any collection based on specified filtering criteria. It takes an existing collection and returns a new collection containing only the objects that match the filter expression. Many cmds (get_cells, get_pins, etc) also provide an alternative -filter option, which filters objects out before they are ever included in the collection; see details in later section.
    • ex: filter_collection  [get_cells *] "is_hierarchical == false && ref_name=~*AN2*" => get_cells creates a collection of all cells in design, and filter_collection then keeps only those cells that are not hierarchical (i.e leaf cells) and whose reference name has pattern AN2 in it. So, it finally gets all AND2 leaf cells. NOTE: double quotes are only at start and end of filtering (not with each filter expr)
    • ex: filter_collection $coll -regexp "full_name =~ ^${softip}/eco_\[a-z\]+_icd/.* or full_name =~ ^${myip}/.*" => here regex used, so we use .* instead of *. Also, keywords "and", "or" may be used instead of &&, ||, etc. $coll is a collection generated from o/p of some prior cmd.
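The return conventions of remove_from_collection and compare_collections described above can be illustrated with a toy model. This is only a sketch in Python, using plain lists of object names in place of collection pointers; the function bodies and the -1 failure value are assumptions for illustration, not how the tool implements them.

```python
def remove_from_collection(base, to_remove):
    """Return the objects of base that are not in to_remove, keeping base order."""
    drop = set(to_remove)
    return [obj for obj in base if obj not in drop]


def compare_collections(c1, c2, order_dependent=False):
    """Return 0 when both hold the same objects, non-zero otherwise.

    The 0-on-match convention follows the notes above; the exact non-zero
    value (-1 here) is an assumption.
    """
    if order_dependent:
        same = list(c1) == list(c2)
    else:
        same = sorted(c1) == sorted(c2)
    return 0 if same else -1


all_inputs = ["CLK", "IN1", "IN2"]           # stand-in for [all_inputs]
dports = remove_from_collection(all_inputs, ["CLK"])
print(dports)
print(compare_collections(["out1", "out2"], ["out2", "out1"]))
print(compare_collections(["out1", "out2"], ["out2", "out1"], order_dependent=True))
```

As with "string compare" in Tcl, a check written as "if the result is 0" means "the two are equal", which is easy to get backwards in scripts.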

2. query objects:

We can do "echo" of a list, and it will print the list, but with collection, a simple "echo" won't return the list. When we do "echo" on collection, we only get a pointer to collection. However, on screen, we do see o/p similar to echo of list (where it's listing names of objects in collection), whenever we run any cmd that outputs a collection. This is done by vendor tools just for convenience purpose. CAD tools by default call an implicit "query_objects" cmd, whenever any cmd that outputs a collection is run. A default limit of 100 is set as the max number of objects that can be displayed (controlled by a variable "collection_result_display_limit" whose default value of 100 can be changed to anything we want). Though this works for most viewing purpose, we can't use this printed o/p within a script as the cmd itself just returns a pointer to collection and not a list of objects in the collection. In order to see what's stored in collection, we can also use a built in collection proc "query_objects". "query_objects" takes as i/p either a collection or a pattern. By default, query_objects displays names of objects (-verbose option displays class of each object as cell, net, etc too). Again, this is also for display purpose only, and its o/p can't be used in scripts, as it always returns an empty string. For getting names of objects to be used in a script, we have to use function like "get_object_name" on the collection. There are many other cmds for getting other attributes of each object in the collection.

  • ex: query_objects [get_cells o*] => ["or1", "or2"]. get_cells returns a collection which is then passed to query_objects which gets the names and outputs it as a list. Here query_objects is passed a collection as its i/p. Here o/p is in default legacy format
  • ex: query_objects -class cell U* => [U1 U2]. When i/p to query_objects is a pattern, we have to specify a class as cell, net, etc (classes are application specific). Here o/p is in tcl format (i.e tcl list with no commas etc), which is possible by setting this app var => set_app_var query_objects_format Tcl

All below cmds expect collection, and so can only be passed a pointer to collection. Names or patterns as options will give an error. This is diff than query_objects cmd above which can take patterns as an i/p too.

  1. get_object_name => convenient way to get the full_name attribute of objects in a collection. When multiple objects passed as i/p, o/p is returned as a list. Cadence supports this as this cmd was added in SDC for compatibility while reading in SDC files with non-SDC constraints. However, it just returns specified i/p args provided.
    1. ex: pt_shell> get_object_name [get_cells i*] => ["inx1", "ind2"]. get_cells returns a collection which is then passed to get_object_name.
    2. ex: get_object_name "in1/cell2" => errors out since "in1/cell2" is not a collection pointer. To make it work, do: get_object_name [get_cells in1/cell2]

-------------

 

Liberty file format: (.lib): These are standard files for representing timing info for stdcells as gates, flops, etc. They contain all arcs for all stdcells, as well as functionality of these stdcells. That is why synthesis tools are able to map RTL to gate, by using this functionality information for all stdcells present in these liberty files. They use timing info from these files to figure out optimal gates to meet timing.

The most common liberty files in use are the ones used for higher node tech ( >22nm). These have simple look up table (LUT) delays specified for all cells.  This is the conventional NLDM (non linear delay model) based. The other more accurate one is CCS (composite current source) model which is employed for tech 22nm and below to give accuracy within 2% of spice simulations. CCS will be discussed later.


syntax:

A very good resource is the official Liberty user guide and reference manual uploaded here: liberty.pdf

General syntax of a test.lib file is as follows:

1st stmt names the library. stmts that follow are library level attributes that apply to the whole lib, as tech type, defn, defaults, etc. then every cell in lib has separate cell description.


stmts are building blocks of lib. The common ones are:
1. group stmt: {} used to enclose contents of group. Ex:
  pin(A) {
    related_pin: B; //pin group stmt

   cap1_rise (cap_template) { index_1 ("..."); values (" ..."); } // groups may be nested recursively here
  }

2. Attribute stmt: attribute_name: attribute_value; => attribute value sometimes enclosed in double quotes. Attributes explained in detail later.
pin (A){
 direction : output;
 function : "X+Y"; => this is used by synthesis tool, to figure out which gate to use for given RTL logic.
}

3. define stmt: to create new attribute. syntax is: define (attr_name, group_name, attr_type);
Ex: to define a new string attribute called bork, which is valid in a pin group, use
define (bork, pin, string) ;
You give the new attribute a value using the simple attribute syntax:
bork : "nimo" ;

4. wire load: define the estimated wire length as a function of fanout. You can also define scaling factors to derive wire resistance, capacitance, and area from a given length of wire.
wire_load("3K_2LM") { //name => implies it's for 2 metal layer and for design whose size is < 3K.
    resistance : 0; //res in ohms/unit length. Res=0 implies no resistance.
    capacitance : 1; //cap in cap_unit/unit length. Note cap unit is pf, so cap=1pf/unit length
    area : 0; //area/unit length
    slope : 0.0118413; //characterizes linear fanout length behavior beyond the scope of the longest length described by the fanout_length attributes.
    fanout_length(  1, 0.005469 ) ; //for fanout=1, estimated wire length is 0.005 units
    fanout_length(  2, 0.00943588 ) ;
    ....
    fanout_length(  19, 0.259363 ) ; //for fanout=19, estimated wire length is 0.26 units (for linear interpolation, wire length for FO=19 = 0.005*19=0.1, so actual wire length is higher than linear interpolation)
}

wire_load("3K_3LM") {//name => implies it's for 3 metal layer and for design whose size is < 3K. similarly for 3K<6K, 6K<16K, so on.
         resistance : 0;
       ...
  }

#wire load selection criteria is given below which selects from one of the wire load models above
wire_load_selection (2LM) {
                  wire_load_from_area (0, 3000, "3K_2LM" ); => specs that if 0 < area_of_design < 3000, choose 3K_2LM wire load model.
                  wire_load_from_area (3000, 6000,  "6K_2LM" ); => choose 6K_2LM for 3000 < area_of_design < 6000. 6K_2LM usually has longer length for a given FO than 3K_2LM, as bigger the design, longer the nets for a particular FO. Similarly 6K_3LM has lower length for a given FO compared to 6K_2LM as 3 metal layers provide more routing resource, so longer wires not needed.
 }
wire_load_selection (3LM) {
       ...
 }

  default_wire_load      : "6K_3LM"; => by default, wire_load(6K_3LM) is chosen.
  default_wire_load_selection   :  3LM ; => by default, wire_load_selection(3LM) is chosen, and within this 6K_3LM is chosen.
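To make the wire load lookup concrete, here is a small sketch in Python of how a length estimate could be derived from the fanout_length entries and the slope, and how a model could be picked by area. It uses only the three fanout_length entries that are visible in the 3K_2LM example above (the elided entries are not filled in). The function names, the half-open area ranges, and the choice to raise an error for a missing table entry are assumptions for illustration, not the behavior of any particular tool.

```python
def wire_length(fanout, fanout_length, slope):
    """Estimated wire length for a net with the given fanout.

    fanout_length: dict {fanout: length}, as in fanout_length(fo, len) stmts.
    Beyond the largest listed fanout, length grows linearly with 'slope'.
    """
    if fanout in fanout_length:
        return fanout_length[fanout]
    max_fo = max(fanout_length)
    if fanout > max_fo:
        return fanout_length[max_fo] + slope * (fanout - max_fo)
    raise ValueError(f"no fanout_length entry for fanout={fanout}")


def wire_rc(length, resistance, capacitance):
    """Scale per-unit-length res/cap by the estimated length."""
    return length * resistance, length * capacitance


def select_wire_load(area, selection):
    """Pick a wire load model name from (min_area, max_area, name) entries.

    Assumption: a range matches when min_area <= area < max_area.
    """
    for min_area, max_area, name in selection:
        if min_area <= area < max_area:
            return name
    return None


table_3k_2lm = {1: 0.005469, 2: 0.00943588, 19: 0.259363}
length_fo20 = wire_length(20, table_3k_2lm, slope=0.0118413)
print(length_fo20)                                   # extrapolated past FO=19
print(wire_rc(length_fo20, resistance=0, capacitance=1))
print(select_wire_load(2500, [(0, 3000, "3K_2LM"), (3000, 6000, "6K_2LM")]))
```

With resistance : 0 in the model, the estimated wire contributes only capacitance to the driver's load, so net delay from the wire itself is zero and only the cell delay changes.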

5. include_file(file_name); => This includes that file from the dir specified in search path.
 

6. fanout_load: this specifies fanout load for each i/p pin of cell. If not specified, default_fanout_load defined at top of lib file is used.
This may be some number as 1 for smallest size gate (invx1), and then defined appropriately for bigger gates. It is used by the synthesis tool when we specify a max_fanout limit: the fanout_load values of all the i/p pins attached to an o/p pin are added to calculate the total fanout load on that o/p pin, which is then checked against the limit.
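The fanout check described here is just a sum and a comparison. A minimal sketch in Python, assuming hypothetical helper names and that a pin with no explicit fanout_load falls back to default_fanout_load:

```python
def total_fanout_load(sink_loads, default_fanout_load=1.0):
    """Sum fanout_load over all i/p pins driven by one o/p pin.

    sink_loads: list with one entry per driven i/p pin; None means the pin
    has no explicit fanout_load, so the library default is used.
    """
    return sum(default_fanout_load if load is None else load
               for load in sink_loads)


def violates_max_fanout(sink_loads, max_fanout, default_fanout_load=1.0):
    """True when the summed fanout load exceeds the max_fanout limit."""
    return total_fanout_load(sink_loads, default_fanout_load) > max_fanout


# o/p pin driving three gates: two with explicit loads, one using the default
loads = [1.0, 2.0, None]
print(total_fanout_load(loads))
print(violates_max_fanout(loads, max_fanout=20))
```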

7. function: used to represent function of o/p pins of a cell
function : "A&B"; => rep that o/p pin is AND of i/p pins.

Note that simple combinatorial gates can be represented by "function:" stmt, but with seq logic, it's not easy. For latches/flops, we use special keywords. In .lib file, "latch" group used to describe latches and "ff" group used to describe flops. In GTECH (during synthesis in Synopsys DC), both registers and latches are represented by a SEQGEN cell, which has many i/p and o/p pins. Any type of flop/latch can be configured from this SEQGEN cell by tying its various inputs and outputs.



Example.lib:  The lib example below can be applied to any cell, an std cell, or an IP e.g memory module. Just as for a std cell, we specify setup/hold arcs or delay arcs for all i/p pins, we do the same for an IP lib file for all its i/p pins. When lib files are created for IP, they are called ETM (extracted Timing model). These ETM hide the internal details of an IP, and just show the arcs on all i/p and o/p pins. These ETM are also used in big SOC, since over there, we run timing on block level, and then when moving to higher level, we generate ETM models of these lower level blocks. That way STA runs much faster at higher module level. We finally take this approach all the way to chip level, where all top modules in it are ETM. This allows STA to run in a much shorter time. In some cases where SOC have 10B+ transistors, it's not even possible to run STA flat on chip level gate netlist, since it will take weeks to complete. On other hand, top chip level runs with ETM of lower level blocks can run in less than a day.

library (LIB_W_150_2.5_STDCELL.db) { /* name of library. name can be with .db or w/o it. entire lib desc, lib level attr desc below */

  /* general lib attr */
  technology (cmos); /* tech tools used, default name is cmos*/
  delay_model : table_lookup; /* which delay model to use in delay calc. generic_cmos is default, which is simplest model. 4 others are table_lookup, piecewise_cmos, dcm, polynomial. table_lookup is most common. table_lookup is aka Non linear Delay model (NLDM), and this is the one which is shown in this example below*/
  bus_naming_style : "Bus%sPin%d"; /* naming convention for buses */
  routing_layers ("routing_layer_1, routing_layer_2"); /* all routing layers available for PnR */

  /* delay and slew attr */
  //define various slew and delay attr like thresholds for measuring delay and slew ..
  input_threshold_pct_fall : 46; // threshold of 46% fall at i/p pin of receiver for measuring delay
  input_threshold_pct_rise : 46;
  output_threshold_pct_fall : 46; // threshold of 46% fall at o/p pin of driver for measuring delay
  output_threshold_pct_rise : 46;
  slew_lower_threshold_pct_fall : 20; //slew starting point is at 20% rise/fall
  slew_lower_threshold_pct_rise : 20;
  slew_upper_threshold_pct_fall : 80; //slew ending point is 80% rise/fall. This start/end points are used to get the linear slope of waveform
  slew_upper_threshold_pct_rise : 80;

  /* define units */
 time_unit: "10ps"; /* to identify physical time unit in lib. most common is 1ns*/
 voltage_unit: "100mv"; /* to scale i/p, o/p voltage groups. most common is 1V*/
 current_unit: "1mA"; /* drive current unit generated by o/p pads, or pull-up/pull-down transistor */
 pulling_resistance_unit: "10ohm"; /* res for pull-up/pull-down transistor */
 capacitive_load_unit (1,pf); /* unit for all caps*/
 leakage_power_unit: 100uW; /* unit of power values. Power units are usually not reported, and calculated from V, I, C. However, lkg is added for Synopsys DesignPower*/

 voltage_map (VDD, 0.5);  //These map var VDD to 0.5V. Similarly map other voltages as VPP, VBB, VSS. These mappings are needed if these var are used later.
  ...

default values /* env defn*/
  nom_process : 3;
  nom_temperature : 150;
  nom_voltage : 2.5;
  default_fanout_load : 1; => by default, each i/p pin assigned a fanout load of 1. we override this by assigning explicit FO on each i/p pin of all cells (by using fanout_load : 1;)
  default_max_fanout : 20; => max_fanout set at 20 for all o/p pins. we don't specify it explicitly for o/p pins except for tie_hi/tie_lo pins of TIE cell.
  default_input_pin_cap : 1; => default is 1 unit. however, each i/p pin assigned explicit cap (by using capacitance : 0.004)
  default_inout_pin_cap : 1;
  default_output_pin_cap : 0; => default is 0 unit. however, each o/p pin assigned explicit cap (which is again very close to 0, as src/drn cap is negligible)

  operating_conditions (W_150_2.5) { //just one op cond specified for particular lib. name is W_150_2.5 (W=weak, T=150C, V=2.5V) but can be anything as "SlowSlow_0p9v_m25c". Here op cond is WCCOM (worst case cond). Usually there is only 1 op cond specified in single lib, but there may be multiple too, in which case we choose the one we want. Other op cond BCCOM (best case cond) may be defined in some other lib. This section is used in PT/synthesis to set operating condition. More details on "set_operating_conditions" specified in "PT - OCV" section.
    process : 3; => Process is usually defined as a number where some process number=nom. Any number below nom is considered fast process, while number above nom is considered slow process.
    temperature : 150; => This defines Temperature for this lib
    voltage : 2.5; => This defines voltage for this lib
    tree_type : "balanced_tree";  //interconnect model for calc interconnect delay. During Synthesis, "compile" cmd uses the model from here to select a formula for calc interconnect delays. 3 models available: best_case_tree (uses lumped RC model), worst_case_tree (all loads assume full wire resistance) and balanced_tree (all loads share wire resistance evenly). Here, model is "balanced_tree".

  voltage_map(VDD_HIGH, 0.540); => This is latest liberty cmd, that is used to map voltage for PG (power/ground) pin of block. This is the voltage that this PG pin is mapped to for this corner. The flow defaults to this voltage, when no other voltages are set on this pin. It also issues error/warning, when the voltage set on this pin, is not within a certain range of this voltage. We can specify voltage map for all PG pins of this block. NOTE: operating_conditions also specifies voltage for a block, but it specifies for whole of the block, not for each individual power pin of block. More usage of this is explained in "PT - DSLG flow" section.


  delay_lut_template (name) { //name may be delay_template_5x6 or something descriptive. There may be multiple of these lut for power, driver_waveform, etc
   //lookup table template info. Below info says that when table is 2D with 2 indices, then 1st index is o/p net cap, while 2nd index is i/p slew, and the value reported in lut is the "delay" value corresponding to this o/p cap and this i/p slew.
    index_1 ("1,2,3,4,5"); //index values here may be real values too
    variable_1 : total_output_net_capacitance; //o/p net cap used for table look up. variable1 corresponds to index1
    index_2 ("1,2,3,4,5,6");
    variable_2 : input_net_transition; //i/p net transition on that pin used for table look up. variable2 corresponds to index2
NOTE: each row in the 2D table reported later is for index_1 (so 5 rows), while each column in the 2D table refers to index_2 (so 6 columns) for the entries that we see in table. var1 and var2 may be other way around too
  }

  //wire load models
  wire_load("3K_2LM") { ...}
  wire_load("3K_3LM") { ...}

  wire_load("zwlm") { resistance: 1; capacitance:0; fanout_length(1,0) ... } //this is zero wire load model which says that res=1ohm and cap=0pf per unit length of wire, and for fanout=1, assume length to be 0, FO=2, assume length to be 0 and so on until FO=20. So, essentially. RC delay is going to be 0 for all wires, as wire length is assumed to be 0 for all connections
  ...
  wire_load_selection (2LM) { wire_load_from_area(0, 110300, "zwlm"); } //each of these wire load selection chooses a particular wire load model from above based on area of design. here it says that if area of design is between 0 and 110300 units then choose zwlm.
  wire_load_selection (3LM) { ... }

  default_wire_load      : "6K_3LM";
  default_wire_load_selection   :  3LM ;
  default_wire_load_mode: segmented ;


 //////// All cells power/delay data ////////
  cell (name1) { /* cell defn */
    //general info for each cell. All these attributes are defined by liberty syntax. We can have as many attributes for each cell as we want.
    version : 1.0;
    cell_leakage_power : 4.579760E+01;
    area : 1.25;
    cell_footprint : AN2;

    pg_pin (VDD) { pg_type: primary_power; voltage_name:VDD; related_bias_pin: VPP; } => optional. All pg_pins as VDD, VSS, VPP, VBB specified here

    //optional: lkg pwr for each combo of i/p pins. default lkg pwr is the one above (when none of below conditions occur). This is needed only if we want to model very accurate leakage power (<22nm)
    leakage_power ()  {
      value : 4.112860E+01;
      when : "A&!B"; //lkg pwr when A=1,B=0. similarly we define for other combo of A,B

      related_pg_pin: VDD //if we have multiple Power pins, then we can define power consumption for each pin separately. If we have bias voltage for nwell as VPP, then we have separate lkg power related to that pin.
    }
    
    //info for each i/p pin.
    pin (A)  { //similarly for pin B and other i/p pin
      capacitance : 0.0027; //i/p cap on pin A (may have rise_cap and fall_cap also listed separately for low tech node (<22nm), however, rise/fall cap are very close to regular cap of pin)

      receiver_capacitance () { //apart from simple values above, we can specify i/p slew dependent cap values to be used in receiver model in CCS model. More details in CCS section.

         when: "!B&SI"; //cap can be different based on i/p pin state. So, we can specify condition based cap

         receiver_capacitance1_rise (receiver_cap_template_8x8) { //we have 4 such values for cap1_rise, cap1_fall, cap2_rise and cap2_fall
        index_1 ("0.00340741, 0.0126433, 0.031115, 0.0681482, 0.142125, 0.290168, 0.586164, 1.17816"); //i/p slew
        values ( \
          "0.000288178, 0.000321005, 0.000330344, 0.000333917, 0.000335357, 0.000336019, 0.000336375, 0.000336634" \ //cap1_rise values for diff i/p slew rate
        );
      }

      max_transition : 4.00; //max transition tolerated on i/p pin A. this max transition is there since the timing table for o/p pin has look up values upto tran time of 4ns. Any transition greater than 4ns has to be extrapolated by the timing tool to come up with delay for the cell, which may be inaccurate.
      direction : input;
      fanout_load : 1; //this pin assigned FO=1. This number is used by tool to estimate wireload for net connecting to this pin. This FO is also used to calc total FO load on each net for max FO Design rule violation. For bigger gates, we may assign FO=2,3,etc.

     related_power_pin: VDD; //when we have pwr/gnd pins, we assign related power, gnd and bias pins (3 separate stmt)

      //internal_power arcs for i/p pin usually don't exist, since internal pwr is already captured in o/p pin. But when we have multiple i/p pins, it's possible that some internal pwr gets consumed, when i/p pin changes even when o/p pin doesn't change. This happens due to redistribution of cap on internal nodes, due to i/p pin switching. Note that if o/p pin toggles due to i/p pin toggling, then it gets reported as internal pwr on o/p pin. Pwr consumed here is small, so most libs do not care about internal pwr on i/p pins of stdcells. Only used for lower nm tech where we want to model power accurately
      internal_power ()  { //when pin A is toggling. similarly for pin B.
        when : "!B"; //this when condition is necessary, since this internal pwr only gets consumed for NAND gate when other pin=0. This forces o/p pin to 1. So, pin A toggling doesn't cause o/p to change in this case, resulting in internal pwr on pin A only
        related_pg_pin: "VDD"; //if we have multiple pwr pins like VDD, VPP, then we define pwr separately for each pg_pin, so that we can separate out current thru each of these pins. So, for 2 PG pins as VDD, VPP, we repeat this internal power table for pin A 2 times
        rise_power (inpower_template_8x1)  { //there is only index_1 which has i/p transition time on it. There is no index for cap here
         }    
        fall_power (inpower_template_8x1)  {
        }
    }

    //info for o/p pin
    pin (Y)  {
      //general info
      capacitance : 0.0000; //drn cap on o/p pin is 0 (we may also omit this)
      max_capacitance : 0.15; //max cap tolerated on o/p pin Y. this max cap is there since timing table for o/p pin has look up values upto max cap of 150ff. Any cap load of greater than 150ff has to be extrapolated by the timing tool to come up with delay for the cell, which may be inaccurate. We may also specify a min_cap which refers to the smallest cap present in LUT
      direction : output;
      function : "A&B"; //used by tools to know functionality of cells !=NOT, +=OR, &=AND
      power_down_function: "!VDD + !VPP + VSS + VBB"; // This says that cell is powered down when any of these is true: VDD=0, VPP=0, VSS=1 or VBB=1 (+ is OR; 0=not present, 1=present)

      related_bias_pin: VPP; //when we have pwr/gnd pins, we assign related power, gnd and bias pins (3 separate stmt)


      //timing arcs for o/p pin rise/fall delay and rise/fall transition wrt to all i/p pins
      timing ()  { //timing wrt to i/p pin A. similarly for timing wrt i/p pin B
        transport : "NO";
        related_pin : "A";
        timing_type : combinational;
        timing_sense : positive_unate;//+ve means o/p goes in same dirn as i/p

        when: "A1&!A2"; //optional, specifies that this timing arc is to be used when A1=1,A2=0. We also specify a sdf condition (sdf_cond: "A1==1'b1 && A2==1'b0") that is used when generating sdf file.
        mode(my_mode, "scan_2"); //optional. We can specify each timing arc to be valid for specific conditions only. We achieve this via "mode" attribute. A mode attribute pertains to an individual timing arc. We specify a mode_name and mode_value, and this timing arc is active only when mode is set to that value. Here, we set our variable "my_mode" to value="scan_2", so this timing arc will be picked only when "my_mode" is set to "scan_2" mode. Here my_mode is not just a variable, that can be set via "set my_mode scan_2", but rather a mode variable, set via PT cmd "set_mode" in synthesis/STA scripts. See details of this cmd in PT cmds section.
        rise_transition (transitiondelayload6slew7_6x7)  {  } //NLDM LUT for o/p slew, similarly for fall_transition
        cell_rise (celldelayload6slew7_6x7)  {  } //NLDM LUT for o/p delay, similarly for cell_fall
      }
      timing ()  { //timing wrt i/p pin B
      }

      //power arcs for o/p pin rise/fall wrt to all i/p pins. arcs similar to those of timing
      //assumption is that both pins will never change at exactly the same time. so we can calc power wrt 1 pin toggling, then wrt other pin toggling
      internal_power ()  { //when pin A is toggling. similarly for pin B.
        related_pin : "A";
        related_pg_pin: "VDD"; //if we have multiple pwr pins like VDD, VPP, then we define pwr separately for each pg_pin, so that we can separate out current thru each of these pins. So, for 2 PG pins as VDD, VPP, we repeat this internal power table for pin A 2 times
        rise_power (outputpower_cap4_trans5)  {
        }    
        fall_power (outputpower_cap4_trans5)  {
        }

      internal_power ()  { //power when pin B is toggling    
      }
    
  }
      //internal power can be for i/p pins as well as o/p pins as we saw above. For std IP as SRAM, etc we have internal power for i/p pins instead of o/p pins as power for IP varies based on whether it's enabled, and whether in rd/wrt mode. This power number accounts for all the power for that IP in various modes
 ex:
    pin (CLK) { ...
        internal_power() {
          power_level : "VDD";
           when : "(WZ&!EZ)"; //similarly power for other modes as wrt=(!WZ&!EZ), idle=(EZ)
           power(inputpower_slew3){
         index_1("0.008,0.1500,0.600"); //i/p pin "CLK" slew rate (0.6ns is max slew rate)
         values(\
        "49.327, 49.315, 49.325"); //energy in pJ for whole IP when in rd=(WZ&!EZ)

              }
        }
        
  //cell info for other cells
  cell (name2) { /* cell defn */
   cell1 info
  }
  type (name) {
   bus type name
  }
  input_voltage (name) {
   input voltage information
  }
  output_voltage (name) {
   output voltage information
  }
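The NLDM tables in the example above (cell_rise, rise_transition, etc) only hold values at the index_1 x index_2 grid points given by the lut template, so a delay for an arbitrary (o/p cap, i/p slew) pair has to be computed from neighboring entries. A common way to do this is bilinear interpolation, sketched below in Python. The function names and the sample table are made up for illustration, and real delay calculators may use a different fitting scheme; points outside the table are extrapolated from the nearest segment, which is why max_transition and max_capacitance limits matter.

```python
from bisect import bisect_right


def _segment(axis, x):
    """Indices (i, i+1) of the axis segment used for x.

    Values outside the axis reuse the first/last segment (extrapolation).
    """
    i = bisect_right(axis, x) - 1
    i = max(0, min(i, len(axis) - 2))
    return i, i + 1


def nldm_lookup(index_1, index_2, values, x1, x2):
    """Bilinear interpolation in a 2D table: rows follow index_1, columns index_2."""
    i0, i1 = _segment(index_1, x1)
    j0, j1 = _segment(index_2, x2)
    t = (x1 - index_1[i0]) / (index_1[i1] - index_1[i0])
    u = (x2 - index_2[j0]) / (index_2[j1] - index_2[j0])
    row_lo = values[i0][j0] * (1 - u) + values[i0][j1] * u
    row_hi = values[i1][j0] * (1 - u) + values[i1][j1] * u
    return row_lo * (1 - t) + row_hi * t


# Hypothetical 5x6 table: 5 rows for index_1 (o/p cap), 6 columns for index_2 (i/p slew)
caps = [1, 2, 3, 4, 5]
slews = [1, 2, 3, 4, 5, 6]
delay = [[10 * c + s for s in slews] for c in caps]

print(nldm_lookup(caps, slews, delay, 3, 4))      # exact grid point
print(nldm_lookup(caps, slews, delay, 2.5, 3.5))  # between grid points
```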

 


 

INTERNAL PIN: apart from i/p and o/p pins, we can define internal pins also. This is needed in cases, where there's a complex IP, and it has clocks generated internally that time i/p and o/p ports. In such cases, we define internal pin, which is some divided version of i/p clk, and characterize its timing based on i/p clk rise/fall. We can have all timing arcs here as setup/hold delay arcs as well as min_pulse and min_period arcs, etc.

ex:

    pin("clk_pll_checkpin_int") {
      direction : internal ;
      clock : true ;
      capacitance : 0.000000 ;
      timing() {
        related_pin : "clk1_ext" ; //this is the i/p clk pin of the IP, which serves as the master source of this internal clock pin. We define timing for gen clk wrt master clk, so that gen clk can be timed correctly based on master clk i/p slew
        timing_type : combinational ;
        cell_rise (....);
      }

     timing() {
        related_pin : "clk_pll_checkpin_int" ; //this is related to itself as min_pulse_width/min_period types are defined on the pin itself
        timing_type : min_pulse_width ;
        rise_constraint (....);
      }

  pin("IN1") { direction: input; ... timing() {
        related_pin :"clk_pll_checkpin_int"; //Here i/p port IN1 has timings related to the internal clk defined above.

  ...... }

We can also use "generated_clock" directive to define internal generated clocks. This is so that we don't have to write "create_generated_clock" cmd ourselves to create internal generated clocks. This may be useful in some cases. However, more often we remove these internal clocks inside the phy, and it's preferred to write your own "create_generated_clock" cmd to create clks inside the phy. That way we have more control on what we want.

ex: generated_clock(my_int_clk) { /* This internal clk is defined as div by 2 of master clk, and it's defined on "port1_clk" pin of the IP. There still needs to be a path via "arcs" from gen_clk to master clk, for this gen clk to be created, else PT will give PTE-075 error "gen clk has no path to master clk"*/
      clock_pin : port1_clk ;
      master_pin : ext_800m_clk ;
      divided_by : 2 ;
    }
 

CHECKPIN: Timing tools such as PT create their own internal pins for certain arcs even when the .lib being read doesn't have any internal pins with that name. PT creates an internal pin with name "*checkpin*" whenever a pin has both a combinational and a sequential delay timing arc. This is done to separate the two types of arcs. For ex: consider a cell which has a clk->q arc and a combo clk->gated_clk. Here 1st arc is seq, while 2nd arc is combo. We could have written both arcs with related pin as "clk". But PT chooses to create a "checkpin" for seq arc, where clk->q is now referenced as clkcheckpin1->q along with other setup/hold seq arcs also referenced wrt checkpin. clk->gated_clk is still referenced wrt original "clk". A new combo arc from clk->clkcheckpin1 is created with 0 delay. All of this internal "checkpin" creation is done when reading in .lib or .db. So, don't be surprised if you see arcs referencing checkpin, when you have no such internal clks. It's something peculiar to PT only.

Details of this is on solvnet => https://solvnetplus.synopsys.com/s/article/Internal-Checkpins-Created-in-Some-Library-Cells-1576002481225

 


 

Attributes:

As we saw above, we have various attributes for cells, pins, etc. 1 of the most important attribute in "timing" group is "timing_type" attribute. It's used by timing tools to determine timing paths. timing_sense attribute is used along with this. Also, we have related and constrained pin concept that these attr apply to:

Constrained pin: This is the pin which is being constrained. When we write timing arcs, this is the -to pin. For ex: EN pin of a clk gater is a constrained pin. This is the pin that you will see in .lib as "pin(PIN_1) { ... }"

Related pin: Any constrained pin may be constrained wrt multiple pins. When we write timing arcs, this is the -from pin. For ex: EN pin of a clk gater may be constrained wrt clk pin, wrt clear pin, wrt set pin, etc. All of these pins as clk, clear, set, etc are called related pins. These are the pins that appear within constrained pin section in .lib as "timing() { {related_pin : "clk"; ... } {related_pin : "set"; ... } } etc.

1. timing_sense attribute can be unate or non_unate. unate is when o/p dirn is dependent on i/p direction (i.e inverter o/p is always opposite of inverter i/p). Non_unate is when o/p dirn has no fixed relationship to i/p dirn (i.e flop o/p pin Q can be rising or falling with no relationship to i/p pin D dirn). This attr is needed since timing tools can't determine the sense as they can't see the guts of logic. Unate can be +ve unate or -ve unate.

positive_unate : if rising/falling change on i/p causes o/p to rise/fall (same polarity),

negative_unate: if rising/falling change on i/p causes o/p to fall/rise (opposite polarity).
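The unate definitions above can be checked mechanically when the logic function is known. Below is a minimal Python sketch (not something a timing tool runs; the helper name `timing_sense` and the "no_arc" label are made up for illustration) that classifies one input of a boolean function by comparing the output with that input at 0 and at 1, for every combination of the other inputs.

```python
from itertools import product

def timing_sense(func, n_inputs, pin):
    """Classify input number `pin` of boolean `func` (inputs/outputs are 0/1 ints)."""
    same_dirn = False      # some case where i/p rising makes o/p rise
    opposite_dirn = False  # some case where i/p rising makes o/p fall
    for others in product([0, 1], repeat=n_inputs - 1):
        lo = list(others[:pin]) + [0] + list(others[pin:])
        hi = list(others[:pin]) + [1] + list(others[pin:])
        out_lo, out_hi = func(*lo), func(*hi)
        if out_lo < out_hi:
            same_dirn = True
        elif out_lo > out_hi:
            opposite_dirn = True
    if same_dirn and opposite_dirn:
        return "non_unate"
    if same_dirn:
        return "positive_unate"
    if opposite_dirn:
        return "negative_unate"
    return "no_arc"  # hypothetical label: o/p does not depend on this pin

print(timing_sense(lambda a, b: a & b, 2, 0))        # AND
print(timing_sense(lambda a, b: 1 - (a & b), 2, 0))  # NAND
print(timing_sense(lambda a, b: a ^ b, 2, 0))        # XOR
```

AND comes out as positive_unate, NAND as negative_unate and XOR as non_unate, which is why an XOR needs arcs for all 4 edge combinations.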

2. timing_type attribute:  distinguishes b/w comb and seq cell. If this attr is not defined, cell is considered combinatorial.  values defined for following timing arcs:

  • I. comb arc: timing arc attached to an o/p pin, and related pin is either i/p or o/p pin. timing arc has rise/fall_transition and cell_rise/fall for o/p pin wrt each i/p pin. It's used for all combo gates as AND, OR, etc. An arc from Clk to Q pin of a flop is NOT a combo arc (explained in seq arc)
    • A. combinational: means o/p can rise or fall. for positive_unate, arc is for R->R,F->F. for negative_unate, arc is for R->F,F->R. for non_unate, arc is for {R,F}->{R,F}
    • B. combinational_rise: rise means o/p is rising only. +ve_unate(R->R), -ve_unate(F->R), non_unate({R,F}->R)
    • C. combinational_fall: fall means o/p is falling only. +ve_unate(F->F), -ve_unate(R->F), non_unate({R,F}->F)
  • II. seq arc: It's either delay arc (clk and o/p data) or constraint arc (clk and i/p data). It's used for flops/latches, etc. The seq arc is from "related" pin to the "constrained" pin.
    • A. rising/falling_edge: arc whose o/p pin is sensitive to rising/falling signal at i/p pin. An ex is CLK->Q arc of a flop. Here when clk rises, o/p pin may rise or fall. It looks like a combo arc (i.e delay from i/p to o/p), but it's actually a seq arc, as the timing path breaks here: a new timing path starts from clk pin to q pin. Another reason it's not a combo arc is that o/p value changes only on +ve edge of clk and not on -ve edge (for a +ve flop). So, to differentiate this CLK->Q arc from a pure combo clk->gclk arc, we write it as seq arc.
    • B. preset/clear: arc affects only the rise/fall arrival time of o/p pin. logic 1/0 is asserted on o/p pin. EX: SR latch has clear arc on "Q" pin wrt "RZ" pin, and preset arc on "Q" wrt "SZ" pin.
    • C. hold_rising/falling: designates rising/falling edge of related pin for hold check.
    • D. setup_rising/falling: designates rising/falling edge of related pin for setup check.
    • E. recovery_rising/falling: uses rising/falling edge of related pin for recovery check. clk is rising/falling edge triggered.
    • F. removal_rising/falling: used when the cell is low-enable latch or rising-edge triggered FF (for removal_rising) or the cell is high-enable latch or falling-edge triggered FF (for removal_falling). intrinsic_rise/fall attr used along with this.
    • G. min_pulse_width: together with minimum_period value, specifies min pulse width for clk pin. Can also be specified for other pins as set/reset, etc. Both *_high/low are defined for clk pins, while only *_high is defined for active high set/reset pins and only *_low for active low set/reset pins. Both high and low pulses need a min width on clk, since a rising edge is involved in both of them (it starts a high pulse and ends a low pulse), and it may be missed if the pulse is too short (low pulse while clk is high, or high pulse while clk is low). If we want min_pulse_width to be specified in the same format as other timing attributes, then we need related_pin set to the same pin as the constrained i/p pin, and timing_type set to "min_pulse_width". Then to specify min_pulse_width_high, we use rise_constraint, which allows different values of high pulse width for different rising transitions of the pin. Similarly fall_constraint means min_pulse_width_low. Usually min pulse width should be greater than a gate delay in that tech, since the clk pulse passes thru several gates inside the flop, so a pulse narrower than a gate delay may be swallowed by the gate itself (i.e pulse may start dying before it even rises to 100%, since the delay is more than the pulse width).
  •  III. nonseq arc: when setup/hold are specified on data pin with a non-clk pin as the related pin. The signal of a pin must be stable for a specified period of time before and after another pin of the same cell change state, for the cell to function as expected. Called nonseq since related pin is not clk. 4 possible arcs are non_seq_setup/hold_rising/falling. rising/falling edge are meant for related pin. These are called data to data paths.
    • Ex: SR latch has non_seq_setup/hold_rising arcs on "RZ"(data) rising wrt "SZ"(clk as related pin) rising and vice versa. This arc exists since when both RZ/SZ go inactive, o/p Q is uncertain depending on which pin went inactive first. Similar arcs for clrz wrt prez and vice versa for all flops/latches which have clrz and prez pins on them.
  •  IV: nochange arc: used for latch devices with latch enable signals. 4 possible arcs of nochange_high/low_high/low indicate +ve/-ve pulse on constrained pin and +ve/-ve pulse on related pin.
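As a quick cross-check of the comb arc table above (I.A to I.C), here is a small Python sketch that lists which (i/p edge, o/p edge) pairs a given timing_type/timing_sense combination covers. The function name `arc_edges` is just an illustration, not a tool command.

```python
def arc_edges(timing_type, timing_sense):
    """Return the set of (input_edge, output_edge) pairs covered by a comb arc."""
    by_sense = {
        "positive_unate": {("R", "R"), ("F", "F")},
        "negative_unate": {("R", "F"), ("F", "R")},
        "non_unate": {("R", "R"), ("R", "F"), ("F", "R"), ("F", "F")},
    }
    out_edges = {
        "combinational": {"R", "F"},    # o/p may rise or fall
        "combinational_rise": {"R"},    # o/p rising only
        "combinational_fall": {"F"},    # o/p falling only
    }
    allowed = out_edges[timing_type]
    return {(i, o) for (i, o) in by_sense[timing_sense] if o in allowed}

print(sorted(arc_edges("combinational", "positive_unate")))
print(sorted(arc_edges("combinational_rise", "negative_unate")))
print(sorted(arc_edges("combinational_fall", "non_unate")))
```

For ex, combinational_rise with negative_unate leaves only the F->R pair, matching the -ve_unate(F->R) entry in item B.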

 




stdcells and their .lib arcs:

In PT, we can see all the arcs for a particular cell by typing:  report_lib <args> (see in PT_ETS.txt for more details). We'll use this cmd when looking at arcs for cells below. This will ensure our cell timing arc understanding is consistent with what Timing tool sees. Below are different kind of stdcells discussed, along with their timing arcs.

1. comb logic: combinatorial gates as AND, OR, etc. arcs are for o/p pin with related i/p pin. o/p pin rise/fall wrt each i/p pin. positive_unate/negative_unate indicates the dirn of input pin. 3 kinds:
A. Data path: Adders, comparators, etc. AD2 (half adder, S=A^B, CO=A&B), AD3 (full adder, S=A^B^CI, CO=A&B+A&CI+B&CI), SU2 (subtractor/comparator)
B. Gates: AN21/NA21 (2/3/4 i/p and/nand gate), BF09/BH03 (2 to 7 i/p Boolean functions), EN21 (2 i/p EX-NOR), EX22 (2/3 i/p EX OR), BU10/IV10 (buffers,tri-state buffers, inverters), OR31/NO31 (2/3/4 i/p or/nor gate)  
C. Multiplexer: MU111 (multiplexer). If the multiplexer is implemented using pass gates then it's no more comb, so special attributes have to be placed for such 1 hot mux.

Example arc for NAND gate. NOTE: AND has 2 gates in it (nand followed by inv), so it's better to look at a nand.
 cell (NA210)  {
    version : 1.0;
    cell_leakage_power : 3.75; //avg (default) lkg power in pW (unit defined in top)
    area : 1.40;
    cell_footprint : AN2;

    leakage_power ()  {//lkg power for A=1, B=0
      value : 6;
      when : "A&!B";
    }
    leakage_power ()  {//lkg power for A=0, B=1
      value : 7;
      when : "!A&B";
    }
   pin (A)  {
      capacitance : 0.0065;//cap in pf. 0.006pf=6ff
      max_transition : 3.50; //max slew rate allowed on i/p pin is 3.5ns (for all cells)
      direction : input;
      fanout_load : 1; //fanout load defined as 1 for i/p pin (for all cells). this fanout load is used when calc FO at any o/p pin (FO load for all i/p pins at receiver added to get FO load at o/p of driver)
    }
    pin (B)  { //for i/p pin B
      capacitance : 0.0063;
      max_transition : 3.50;
      direction : input;
      fanout_load : 1;
    }
    pin (Y)  { //for o/p pin Y
      capacitance : 0.0000;
      max_capacitance : 0.11; //max cap allowed on pin Y is set to 110ff. assume pmos/nmos same size = x. So, i/p cap for EFO purpose = 1/1.5(n)+1(p)=1.66*6ff/2=5ff. max EFO=110/5=22. it's same as for invx1, as all x1 gates have same driving strength. When we go to size x2, max cap is set to 0.22 (since i/p drv strength is twice [i/p cap is 12ff], so max EFO is still 22)
      direction : output;
      function : "!(A&B)";
      timing ()  {
        transport : "NO";
        related_pin : "A"; => related pin says with respect to which i/p pin is o/p delay based on. For flops with pin D, related pin would be CLK for setup or hold checks.
        timing_type : combinational; => refers to related pin dirn (for ex, if it's hold_rising, then rising refers to pin "A" dirn)
        timing_sense : negative_unate; => nand o/p moves opposite to i/p A

        rise_transition (transitiondelayload5slew6)  { //o/p slew rate
          index_1 ("0.0054,0.0162,0.0324,0.0486,0.0864");//o/p load in pf. NOTE: max cap in table here is 86.4ff, while max cap is set to 110ff. So, extrapolation is done.
          index_2 ("0.04,0.1,0.4,0.8,1.5,3.5");//i/p slew in ns (max i/p slew is 3.5ns)
          values (\
                  "0.1665,  0.1666,  0.1707,  0.1765,  0.1864,  0.2188",\ => 1st row is for index_1, entry 1
                  "0.3182,  0.3189,  0.3203,  0.3243,  0.3283,  0.3477",\ => each column is index_2 entry 1-6
                  "0.5517,  0.5515,  0.5516,  0.5549,  0.5566,  0.5669",\
                  "0.7859,  0.7859,  0.7844,  0.7864,  0.7886,  0.7946",\
                  "1.3322,  1.3305,  1.3300,  1.3316,  1.3326,  1.3363"); => 5th row is for index_1, entry 5
        }
        cell_rise (celldelayload5slew6)  { //delay thru cell
          index_1 ("0.0054,0.0162,0.0324,0.0486,0.0864");
          index_2 ("0.04,0.1,0.4,0.8,1.5,3.5");
          values (\
                  "0.2670,  0.2874,  0.3779,  0.4501,  0.5348,  0.6757",\
                  "0.3734,  0.3939,  0.4843,  0.5573,  0.6425,  0.7908",\
                  "0.5278,  0.5482,  0.6387,  0.7130,  0.7971,  0.9461",\
                  "0.6806,  0.7011,  0.7922,  0.8666,  0.9508,  1.0993",\
                  "1.0361,  1.0568,  1.1484,  1.2227,  1.3087,  1.4557");
        }
        fall_transition (transitiondelayload5slew6)  { ... }
        cell_fall (celldelayload5slew6)  { ... }
      }
      timing ()  { ... } //similarly for pin B

      internal_power ()  {
        related_pin : "A";
        rise_power (outputpower_cap3_trans4)  { //pwr in pW when o/p pin Y is rising
          index_1 ("0.0108,0.0432,0.0864");
          index_2 ("0.1000,0.5000,1.2000,3.8000");
          values (\
                  "0.0246,  0.0242,  0.0270,  0.0409",\
                  "0.0255,  0.0244,  0.0257,  0.0367",\
                  "0.0257,  0.0248,  0.0251,  0.0338");
        }
        fall_power (outputpower_cap3_trans4)  { //pwr when o/p pin Y is falling
          index_1 ("0.0108,0.0432,0.0864");
          index_2 ("0.1000,0.5000,1.2000,3.8000");
          values (\
                  "0.0077,  0.0013,  0.0024,  0.0157",\
                  "0.0087,  0.0063,  0.0032,  0.0109",\
                  "0.0090,  0.0077,  0.0062,  0.0084");
        }
      }
      internal_power ()  { ... } //similarly for pin B.
    }
  }
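To see how a tool might turn the cell_rise table above into a delay number, here is a minimal Python sketch using bilinear interpolation between the 4 surrounding table points. The table values are copied from the cell_rise group above. The linear extension of the last segment beyond the table range is an assumption made for this sketch; actual tools may interpolate or extrapolate differently.

```python
from bisect import bisect_right

LOAD = [0.0054, 0.0162, 0.0324, 0.0486, 0.0864]   # index_1: o/p load in pf
SLEW = [0.04, 0.1, 0.4, 0.8, 1.5, 3.5]            # index_2: i/p slew in ns
CELL_RISE = [                                      # one row per index_1 entry
    [0.2670, 0.2874, 0.3779, 0.4501, 0.5348, 0.6757],
    [0.3734, 0.3939, 0.4843, 0.5573, 0.6425, 0.7908],
    [0.5278, 0.5482, 0.6387, 0.7130, 0.7971, 0.9461],
    [0.6806, 0.7011, 0.7922, 0.8666, 0.9508, 1.0993],
    [1.0361, 1.0568, 1.1484, 1.2227, 1.3087, 1.4557],
]

def _segment(axis, x):
    """Index i of the segment axis[i]..axis[i+1] used for x (end segments are
    reused outside the table range, which gives linear extrapolation)."""
    i = bisect_right(axis, x) - 1
    return min(max(i, 0), len(axis) - 2)

def lookup(table, load, slew):
    """Bilinear interpolation of a load x slew table."""
    i = _segment(LOAD, load)
    j = _segment(SLEW, slew)
    u = (load - LOAD[i]) / (LOAD[i + 1] - LOAD[i])
    v = (slew - SLEW[j]) / (SLEW[j + 1] - SLEW[j])
    row_lo = table[i][j] * (1 - v) + table[i][j + 1] * v
    row_hi = table[i + 1][j] * (1 - v) + table[i + 1][j + 1] * v
    return row_lo * (1 - u) + row_hi * u

print(lookup(CELL_RISE, 0.0162, 0.1))   # exact table point (row 2, column 2)
print(lookup(CELL_RISE, 0.0108, 0.04))  # halfway between the first two loads
print(lookup(CELL_RISE, 0.11, 0.04))    # load beyond the last index_1 entry
```

The last call uses a load of 110ff (the max_capacitance value), which is outside the 86.4ff table range, so the result comes from extending the last load segment.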

2. seq logic: Flops and latches. The name of flop/latches in libraries is such that it allows to distinguish b/w scan/no_scan, +ve/-ve, Clrz/Prez/both pins. as an example name XYZ=> X=D(no scan),T(scan). Y=N(-ve),T(+ve), Z=B(both),C(clr),P(preset),N(none). clr/preset are active low.

A. no scan flops: DNB10/DTB10(-ve/+ve, clr/preset), DNC10/DTC10(-ve/+ve, clr), DNN10/DTN10(-ve/+ve, none), DTP10(+ve, preset).

ex: Negative edge triggered D-FF, async active low clear, both Q and QZ outputs, 4X Drive
cell (DNC40) {
 ...//ff group: describes either a single stage or master-slave Flip Flop. ff_bank used to rep multi-bit flip-flop. 
ff ("IQ","IQZ")  { => IQ defines state of non-inverting o/p, while IQZ defines inverting output state (internal states of cross coupled inverters within the flop). These can be named anything except name of a pin in the cell being described.
      next_state : "D"; => required, it's a logic eqn written in terms of i/p pins or 1st state variable (IQ)
      clocked_on : "CLK'"; => required, identify active edge of clock signal (here CLK' indicates it's -ve edge triggered device). all pins listed here are treated as clocks by DC. For ex, for ff with CE pin, we can write clocked_on: "CLK & CE", but then we define clock attribute as true for CLK and false for CE.
      clear : "CLRZ'"; => optional, gives active value for clear input. here it's CLRZ' => clrz bar
      preset : "xx"; =>optional, gives active value for preset input
      clear_preset_var1 : L; => this is there if both clrz,prez pins there. implies IQ=L if both clrz,prez active.
      clear_preset_var2 : L; => this is there if both clrz,prez pins there. implies IQZ=L if both clrz,prez active.
    }
 pin (CLK)  {
      min_pulse_width_high : 0.9572;
      min_pulse_width_low : 0.7352;
      capacitance : 0.0152;
      max_transition : 4.10;
      direction : input;
      fanout_load : 1;
      clock : true; => clock attribute needs to be set to true, so that DC treats this as clock.
...
}
pin (CLRZ)  {
      min_pulse_width_low : 0.6865; //clrz low pulse can't be < 0.68ns. This translates into $width check when running PT/gate_sims. No check for high pulse as high is inactive, so even if there's a high glitch, it's ok as o/p will still be low.
      capacitance : 0.0129;
      max_transition : 4.10;
      direction : input;
      fanout_load : 1;
...
      //timing: 4 arcs = recovery_falling/removal_falling wrt CLK (implies clk falling edge), non_seq_setup_rising/non_seq_hold_rising wrt PREZ. see top of this file for details on various arcs for all cells. Since related pin is "CLK" so timing arc is -from "CLK" pin -to specified pins (i.e -to CLRZ/PREZ etc). This is how seq timing arcs are written. They are always from "related" pin to "constrained" pin.
      timing() { //timing for removal_falling related to clk pin (clk pin falling since it's -ve edge flop)
        related_pin : "CLK"; //since related pin is CLK, arc is: -from CLK -to CLRZ
        timing_type : removal_falling;
        rise_constraint (constraint_slewref_6slewdata_6)  { //note that i/p pins use word "constraint" for timing arcs instead of cell_rise, etc as used for o/p pins. This has rise_constraint only as recovery/removal are for active to inactive edge only
        }
      }
      timing() { //timing for recovery_falling related to clk pin
        related_pin : "CLK";
        timing_type : recovery_falling;
        rise_constraint (constraint_slewref_6slewdata_6)  { //note this has rise_constraint only
        }
      }      
      timing ()  { //CLRZ rising (rise_constraint) should setup some time before PREZ rising (non_seq_setup_rising)
        related_pin : "PREZ"; //since related pin is PREZ, arc is: -from PREZ -to CLRZ
        timing_type : non_seq_setup_rising; //setup arc
        rise_constraint (constraint_slewref_6slewdata_6)  {  
        }    
      }
      timing ()  { //CLRZ rising (rise_constraint) should hold for some time after PREZ rising (non_seq_hold_rising)
        related_pin : "PREZ";
        timing_type : non_seq_hold_rising; //hold arc
        rise_constraint (constraint_slewref_6slewdata_6)  {  
        }    
      }
}
pin (PREZ)  {   //similar arcs for PREZ as for CLRZ
}

pin (D)  {
      capacitance : 0.0054;
      max_transition : 4.10;
      direction : input;
      fanout_load : 1;
...
      //pin D has 2 arcs, setup/hold wrt clk falling
      timing ()  { //pin D needs to setup with clk falling
        related_pin : "CLK"; //since related pin is CLK, arc is: -from CLK -to D
        timing_type : setup_falling;
        rise_constraint (constraint_slewref_6slewdata_6)  { //pin D rising edge setup. setup/hold arcs are dependent on D and CLK pin slew rates, and do not have dependence on o/p load. So, 2D table has index1 as clk_slew and index2 as data_slew
    }
        fall_constraint (constraint_slewref_6slewdata_6)  { //pin D falling edge setup
    }
      }
      timing ()  { //pin D needs to hold with clk falling
        related_pin : "CLK";
        timing_type : hold_falling;
        rise_constraint (constraint_slewref_6slewdata_6)  { //same for hold
    }
        fall_constraint (constraint_slewref_6slewdata_6)  {
    }
      }

}
pin (Q)  {
      capacitance : 0.0000;
      max_capacitance : 0.77;
      direction : output;
      function : "IQ";
 ...
      //Q pin has 4 arcs: delay arcs wrt PREZ falling (Q rising), CLRZ falling (Q falling), CLRZ rising (Q rising), and CLK falling (Q rising/falling)
      timing ()  { //Q rising wrt PREZ falling
        transport : "NO";
        related_pin : "PREZ";
        timing_type : preset;
        timing_sense : negative_unate;
        rise_transition (transitiondelayload6slew7)  {
    }
        cell_rise (celldelayload6slew7)  {
    }
      }

      timing ()  {//Q falling wrt CLRZ falling
        transport : "NO";
        related_pin : "CLRZ";
        timing_type : clear;
        timing_sense : positive_unate;
        fall_transition (transitiondelayload6slew7)  {
    }
    cell_fall (celldelayload6slew7)  {
    }
      }

      timing ()  {//Q rising wrt CLRZ rising. this happens since clrz has priority, so when both clrz,prez are low, then Q=L. But if clrz goes high, then Q goes high as prez is still active.
        transport : "NO";
        related_pin : "CLRZ";
        timing_type : preset;
        timing_sense : positive_unate;
        rise_transition (transitiondelayload6slew7)  {
    }
    cell_rise (celldelayload6slew7)  {
    }
      }

      timing ()  {//Q rise/fall wrt clk falling
        transport : "NO";
        related_pin : "CLK";
        timing_type : falling_edge;
        rise_transition (transitiondelayload6slew7)  {
    }
    fall_transition (transitiondelayload6slew7)  {
    }
    cell_fall (celldelayload6slew7)  {
    }
    cell_rise (celldelayload6slew7)  {
    }
     }
}
pin (QZ)  { //same arcs as those of Q
      capacitance : 0.0000;
      max_capacitance : 0.77;
      direction : output;
      function : "IQZ";
..
}
} => end of cell
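The ff group attributes above can be read as a small behavioral model. Below is a rough Python sketch of that reading for this -ve edge flop. It assumes preset is "PREZ'" (the snippet above only shows a placeholder for it), and the function name `ff_step` is made up for illustration.

```python
def ff_step(iq, iqz, d, clk_prev, clk, clrz, prez):
    """One evaluation of the ff group; all signals are 0/1. Returns new (IQ, IQZ)."""
    clear_active = (clrz == 0)    # clear : "CLRZ'" (active low)
    preset_active = (prez == 0)   # ASSUMED preset : "PREZ'" (active low)
    if clear_active and preset_active:
        return 0, 0               # clear_preset_var1 : L, clear_preset_var2 : L
    if clear_active:
        return 0, 1
    if preset_active:
        return 1, 0
    if clk_prev == 1 and clk == 0:
        return d, 1 - d           # clocked_on : "CLK'" => capture D on falling edge
    return iq, iqz                # no active edge: hold state

print(ff_step(0, 1, d=1, clk_prev=1, clk=0, clrz=1, prez=1))  # falling edge captures D
print(ff_step(0, 1, d=1, clk_prev=0, clk=1, clrz=1, prez=1))  # rising edge: state held
print(ff_step(1, 0, d=1, clk_prev=1, clk=0, clrz=0, prez=1))  # async clear wins over clk
```

Async clear/preset are checked before the clk edge, which is why the constraint arcs (recovery/removal) exist on CLRZ/PREZ wrt CLK.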

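In PT, for a regular flop with D, CLK (named CP here) and Q pins, we see the 5 arcs below: hold/setup on D, the 2 clock pulse width checks on CP, and the CP->Q delay arc. NOTE that the setup/hold and delay arcs are "-from" CP pin (related pin) "-to" D or Q pin. Always keep that in mind when considering arcs.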

pt_shell> report_lib -timing TSM_LIB {DFLOP_SVT}
****************************************

                            Arc                   Arc Pins
   Lib Cell  Attributes    #  Type/Sense      From        To         When
   ----------------------------------------------------------------------------
                 s         0  hold_clk_rise   CP          D          
                           1  setup_clk_rise  CP          D          
                           2  clock_pulse_width_high
                                              CP          CP         D
                           3  clock_pulse_width_low
                                              CP          CP         D
                           4  rising_edge     CP          Q          

B. scan flops: TNB11/TDB11(-ve/+ve, clr/preset), TNC10/TDC10(-ve/+ve, clr), TNN10/TDN10(-ve/+ve, none), TNP/TDP(-ve/+ve, preset). All these scan flops have test_cell group to identify them as scan cells.  
arcs for TDB11 are for:
 I.   prez pin: 4 arcs. 2 arcs are with clk as related pin, recovery_rising/removal_rising(implies clk rising) for prez rising. no falling edge arc as recovery/removal checks are only for async signal going from active to inactive. other 2 arcs are with clrz as related pin, non_seq_setup/hold_rising(implies clrz rising) for prez rising. again no falling edge arcs here.
 II.  clrz pin: 4 arcs same as for prez pin. recovery_rising/removal_rising with clk as related pin, and non_seq_setup/hold_rising with prez as related pin.
 III. Data pin: 4 arcs. with clk as related pin, setup/hold_rising(implies clk rising) for data pin rising and falling.
 IV:  Q pin: 5 arcs. 1 arc is with prez as related pin, "preset" arc for Q rising. 2 arcs are with clrz as related pin, "clear" arc for Q falling, and "preset" arc for Q rising. Note that for clrz related pin, we have "preset" arc also. this is because clrz has priority over prez, so when both clrz/prez are low, and then clrz goes high, then Q goes high. so, we have "preset" arc for Q rising with clrz as related pin. 2 arcs with clk as related pin, "rising_edge"(implies clk rising) for Q rising/falling.
 V:   SD pin: 4 arcs. with clk as related pin, setup/hold_rising(implies clk rising) for SD rising/falling. same as Data pin arcs.
 VI:  SCAN pin: 4 arcs. with clk as related pin, setup/hold_rising(implies clk rising) for SCAN rising/falling. same as Data pin arcs.

ex: Scan flop
cell (TDN10)  { ...
ff ("IQ","IQZ")  {
 next_state : " (D SCAN') + (SD SCAN) "; => states that next state is D when scan=0, and SD when scan=1
 clocked_on : "CLK";
}

//test_cell group: added to the cell desc to identify it as scan cell. this group defines only the non-test mode fn of scan cell.
test_cell ()  { => identifies this cell as scan cell
      ff ("IQ","IQZ")  { => model only the non-test cell behaviour here.
        next_state : "D"; => in no-test, next state=D
        clocked_on : "CLK";
      }
      pin (D)  {
        direction : input;
      }
      pin (CLK)  {
        direction : input;
      }
      pin (SD)  {
        direction : input;
        signal_type : test_scan_in; => scan_data_in
      }
      pin (SCAN)  {
        direction : input;
        signal_type : test_scan_enable; => scan_enable
      }
      pin (Q)  {
        function : "IQ";
        direction : output;
        signal_type : test_scan_out; => scan_data_out
      }
      pin (QZ)  {
        function : "IQZ";
        direction : output;
        signal_type : test_scan_out_inverted; => inverted scan_data_out
      }
}
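The two next_state equations in the scan cell above (full cell vs test_cell group) can be compared with a tiny Python sketch. Function names are illustrative only, and signals are 0/1 ints.

```python
def scan_next_state(d, sd, scan):
    """next_state : "(D SCAN') + (SD SCAN)" => 2:1 mux selected by SCAN."""
    return (d & (1 - scan)) | (sd & scan)

def functional_next_state(d):
    """next_state : "D" from the test_cell group (non-test mode only)."""
    return d

for d in (0, 1):
    for sd in (0, 1):
        # with SCAN=0 the full cell behaves exactly like the test_cell description
        assert scan_next_state(d, sd, 0) == functional_next_state(d)
        # with SCAN=1 the flop captures the scan data instead
        assert scan_next_state(d, sd, 1) == sd
print("scan mux equations consistent")
```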

C. latch (no scan): LAB10( nand SR latch), LAL10/LAH10(active low/high), LAH27(active high with clr/preset), LAH2B(active high with clr)
arcs for LAH27 are for: (clk pin has no arc but has "min_pulse_width_high" check and is tagged as "clock : true"). Note, an active high latch essentially behaves as a -ve flop, so all arcs are same as those for a flop, except for the extra comb arc D->Q on the Q pin.
 I. prez pin:  4 arcs. 2 arcs are with clk as related pin, recovery_falling/removal_falling(implies clk falling) for prez rising. clk falling edge taken as latch turns off at falling edge of clk. 2 arcs are with clrz as related pin, non_seq_setup/hold_rising(implies clrz rising) for prez rising.
 II.  clrz pin: 4 arcs same as for prez pin. recovery_falling/removal_falling with clk as related pin, and non_seq_setup/hold_rising with prez as related pin.
 III. Data pin: 4 arcs. with clk as related pin, setup/hold_falling(implies clk falling) for data pin rising and falling.
 IV:  Q pin: 8 arcs. 2 arcs with prez as related pin, "preset" arc for Q rising (prez falling) and "clear" arc for Q falling (prez rising). 2 arcs with clrz as related pin, "clear" arc for Q falling (clrz falling) and "preset" arc for Q rising (clrz rising). Note that prez has priority over clrz here, so with prez as related pin "clear" arc exists for Q falling. But irrespective of that, whenever clrz or prez go high (while clk is high), then i/p Data will flow to Q, so with clrz rising,  "preset" arc exists for Q rising, and for prez rising, "clear" arc exists for Q falling. So, with prez as related pin, "clear" arc exists for Q falling in 2 ways:
    A. clrz=0, clk=0or1, and prez rises => Q falls (case of prez having priority)
    B. clrz=1, clk =  1, and prez rises => Q falls (case of D->Q path while clk active)
  2 arcs with clk as related pin, "rising_edge"(implies clk rising) for Q rising/falling. 2 arcs with Data as related pin, "combinational" for Q rising/falling.

ex: active high D-latch, async active low clear/preset, both Q and QZ outputs, 4X Drive
cell (LAH21) {
...//latch group below: describes level sensitive storage device. latch_bank used to rep multi-bit latch.  
latch ("IQ","IQZ")  { => IQ defines state of non-inverting o/p, while IQZ defines inverting output state (internal states of cross coupled inverters within the flop). These can be named anything except name of a pin in the cell being described.
      enable: "CLK"; => optional. specify enable (active high)
      data_in: "D"; => optional, data
      preset : "PREZ'"; => preset is active low (note ' at end of PREZ to indicate bar)
      clear : "CLRZ'"; => clr is active low
      clear_preset_var1 : H; => IQ (var1) =H when both preset and clear are active
      clear_preset_var2 : H; => IQZ (var2) =H when both preset and clear are active
    }
 pin (CLK) { .... clock: true; => clock attribute needs to be set to true, so that DC treats this as clock. No timing arcs.
 pin (D) { .. } => 2 timing arcs, setup_falling/hold_falling wrt CLK falling and D pin rise/fall constraint
 pin (CLRZ) or (PREZ) => these don't have any special attr. just treated as normal pins. They have 4 arcs: recovery_falling/removal_falling wrt CLK pin falling and CLRZ rising (rise_constraint), and non_seq_setup/hold_rising for pin CLRZ wrt pin PREZ rising (or for PREZ pin: non_seq_setup/hold_rising for pin PREZ wrt pin CLRZ rising)
 pin (Q) { ...//4 arcs: wrt clrz rise/fall, prez fall, clk fall and combinatorial arc for D rise/fall.
  function: "IQ"; => Q has same value as var IQ above. IQ=H when both clrz/prez active, so prez has priority
 pin (QZ) { ...
  function: "IQZ"; => QZ has same value as var IQZ above. IQZ=H when both clrz/prez active, so clrz has priority

D. latch(with scan): ADD DETAILS


3. clock cells: cells on clk path. CGN4/CGP4 (clk gaters), CTB20 (clk tree buffer)
arcs for CGP40 are for: (CG* cells have statetable instead of function, and then o/p pin uses "state_function" to define functionality)
 I. EN: 4 arcs with CLK as related pin, setup/hold_rising(clk rising) for EN rising and falling. clk rising since active low latch present. Note that arc has to consider path upto the "and" gate to calc setup/hold, since just meeting setup/hold to the latch i/p doesn't guarantee that EN signal will meet setup/hold to "and" gate.
 II. GCLK: state_function: "CLK * ENL", where CLK and ENL(internal node) values are in statetable. 2 "comb" arcs with clk as related pin, for o/p rise/fall.

ex: clk tree buffer
cell (CTB70)  { ...
cell_footprint : CTNIBUF; //Use this attribute to assign the same footprint class to all cells that have the same layout boundary. Cells with the same footprint class are considered interchangeable and can be swapped during in-place optimization. Cells without cell_footprint attributes are not swapped during in-place optimization. NOTE that all CTB are assigned same footprint, even though they have different layout boundary. Similarly for CG*, AN2*, etc. all cells from same class are assigned a footprint in TI lib files.
    dont_touch : true; //marked as don't touch, so that some opt step doesn't touch/remove it
    dont_use : true; //marked as don't use so that they are not used for normal logic design (use only for clk tree)
...}

NOTE: cell_footprint is set to "NIBUF" (non inverting buf) for all buffers (BU110, BU120, etc) and set to "DELAYBUF" for all delay cells (BU112, BU113, BU116, etc). Tool identifies buffers/delay cells by looking at function stmt of cell which is "function : "A";". All delay cells are marked as "dont_use", so normal logic design doesn't use these delay cells to fix hold time.

ex: clk gating cell: CGP10 (passes EN when CLK is Low)
cell (CGP10)  {
    version : 1.0;
    cell_leakage_power : 2.204898E+01;
    area : 4.00;
    dont_use : true;
    dont_touch : true;
    cell_footprint : CGP;
    clock_gating_integrated_cell : "latch_posedge"; => this attr tells the synthesis tool that it's an integrated clk gating cell.

    statetable (" CLK EN","ENL")  { //("i/p node names", "internal node names")CLK, EN are input pins, ENL is defined as internal node. statetable is used to define fn of complex seq cells
      table : "L  L   : - : L ,\ => "i/p values : current internal value : next internal values". When clk=L, EN=L, ENL current value is - (whatever it's supposed to be), and ENL next value is L.
               L  H   : - : H ,\ => here also ENL is same as EN (as CLK is Low=active)
               H  -   : - : N  "; => no change in ENL
    }

    pin (ENL)  { //internal node ENL used to define statetable above
      direction : internal;
      internal_node : "ENL";
    }

    pin (CLK)  {
      ...
      clock : true;
      clock_gate_clock_pin : true; //clk gating attr defined
      internal_power () { .... }
    }

    pin (EN)  {
      ...
      clock_gate_enable_pin : true; //clk gating attr defined
      internal_power ()  { ... }
      //2 timing arcs: setup and hold for EN pin wrt CLK rising (note: arcs are for when clk goes inactive).
      timing ()  { //hold check for EN rise/fall
        related_pin : "CLK";
        timing_type : hold_rising;
        rise_constraint (constraint_slewref_7slewdata_7)  { ... }
        fall_constraint (constraint_slewref_7slewdata_7)  { ... }
      }
      timing ()  { //setup check for EN rise/fall
        related_pin : "CLK";
        timing_type : setup_rising; ...
      }
    }

    pin (GCLK)  {
      capacitance : 0.0000;
      max_capacitance : 0.19;
      direction : output;
      clock_gate_out_pin : true; //clk gating attr defined
      state_function : " CLK * ENL "; //o/p is product of internal node ENL (defined above) and CLK. When CLK=0, o/p=0, but when CLK=1, o/p=ENL

      timing ()  { //clk->gclk delay (written as a combo arc, not a seq arc)
        transport : "NO";
        related_pin : "CLK";
        timing_type : combinational;
        timing_sense : positive_unate;
        rise_transition (transitiondelayload8slew9)  { ... }
        fall_transition (transitiondelayload8slew9)  { ... }
        cell_fall (celldelayload8slew9)  { ... }
        cell_rise (celldelayload8slew9)  { ... }
      }
      internal_power ()  { ... }
   }
}
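The statetable and state_function of the clk gater above can be exercised with a short Python sketch. This is only a zero-delay behavioral reading of the table (no timing), and the function names are made up for illustration.

```python
def cg_step(enl, clk, en):
    """Apply the statetable once and return the next value of internal node ENL."""
    if clk == 0:
        return en   # rows "L L : - : L" and "L H : - : H": latch transparent
    return enl      # row "H - : - : N": no change while CLK is high

def gclk(clk, enl):
    """state_function : "CLK * ENL"."""
    return clk & enl

def simulate(samples, enl=0):
    """samples is a list of (CLK, EN) pairs; returns the GCLK value per sample."""
    out = []
    for clk, en in samples:
        enl = cg_step(enl, clk, en)
        out.append(gclk(clk, enl))
    return out

# EN changes while CLK is high in samples 3 and 6; neither change reaches GCLK
samples = [(0, 1), (1, 1), (1, 0), (0, 0), (1, 0), (1, 1), (0, 1), (1, 1)]
print(simulate(samples))  # -> [0, 1, 1, 0, 0, 0, 0, 1]
```

Since ENL only updates while CLK is low, EN toggling during the high phase can neither chop nor create a GCLK pulse. This is also why EN is constrained (setup/hold) wrt the rising edge of CLK.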

4. special cells:
A. PB110 (3 state bus holder) => no function specified as attribute "driver_type: bus_hold" is defined, indicating it's bi-dir pin, and it holds the last logic value when no-one is driving.
B. TO010 (tie-off cell) : used to tie constant nets to these cells. Tie-off cells are identified by looking at "function : "0"" or "function : "1"" in the pin attribute.
DC will tie any constant net to this cell unless "set_direct_power_rail_tie" is used for that particular net. Then, that net will be left floating during synth, but will be connected directly to vdd/vss during PnR.

cell (TO010) {
      area : 1.75;
      cell_footprint :  TO010;
      pin(LO) {
          max_fanout : 50;
          max_capacitance : 100.04;
          direction : output ;
          function : " 0 " ; => this identifies it as tieoff cell for constant logic "0"
         }
      pin(HI) {
          max_fanout : 50;
          max_capacitance : 100.04;
          direction : output ;
          function : " 1 " ; => this identifies it as tieoff cell for constant logic "1"
         }
 }

5. missing cells:  antenna, decap, filler, tap cells.

A. decoupling cells, filler cells and tap cells: decap cells are cells that have a capacitor placed between the power rail and the ground rail to overcome dynamic voltage drop; filler cells are used to fill the gaps between the cells after placement; and tap cells are physical-only cells that have power and ground pins and do not have signal pins. Tap cells are well-tied cells that bias the n-wells or p-wells (to connect body/substrate of all devices). All of these are identified by using these attributes for cells:
cell (cell_name) {
...
is_decap_cell : <true | false>;
is_filler_cell : <true | false>;
is_tap_cell : <true | false>;
...
}          

NOTE: since these are physical only cells (no logic function or timing), we usually don't put these cells in .lib file. They only exist in *.lef file. Some Synopsys tools will complain about this, since they don't find the correct attribute on the cell (as it's missing in .lib). However, we can create a physical only .lib, and we can put all these cells in there (especially the decap cells). Then we don't see the warnings. Or, we should not put these cells in netlist during synthesis.
ex: decap cell in *PHYS.lib
cell (SPAREMOSCAP) {
  area : 0.75; //no other attribute besides area.
}


B. antenna cells => used to fix antenna violations. It just has an nmos whose gate is tied to vss, and whose src/drn are tied to i/p A.
NOTE: function is not defined for Antenna Protection cell.
  cell (AP001)  {
    version : 1.0;
    cell_leakage_power : 3.828184E+00;
    area : 1.00;
    dont_use : true;
    dont_touch : true;
    cell_footprint : DIODE;

    leakage_power ()  {
      value : 3.884488E+00;
      when : "A";
    }

    leakage_power ()  {
      value : 3.771880E+00;
      when : "!A";
    }

    pin (A)  {
      capacitance : 0.0028;
      direction : input;
      fanout_load : 1;

      internal_power ()  {
        rise_power (inputpower_trans5)  {
          index_1 ("0.0100,0.2000,1.0000,2.0000,4.0000");
          values ("-0.0001, -0.0001, -0.0001, -0.0001, -0.0001");
        }
        fall_power (inputpower_trans5)  {
          index_1 ("0.0100,0.2000,1.0000,2.0000,4.0000");
          values ("0.0001,  0.0001,  0.0001,  0.0001,  0.0001");
        }
      }
    }
  }


----------------

delay models:

To calculate the delay thru any path, the timing tool must accurately calculate the delay and slew (transition time) at each stage of each timing path. A stage consists of a driving cell, the annotated RC network at the output of the cell, and the capacitive load of the network load pins. Models are employed for the driver, the wire network and the receiver load. The driver model models any cell as a driver (which may be a current or voltage source). The wire network is modeled as a reduced RC network. The reduced RC network should behave the same as the original RC network at all frequencies, but needs a lot less computation to calculate delays (PT uses the Arnoldi reduction method). The receiver is simply a capacitance. However, the cap may vary depending on rise/fall transition on the receiver, min/max condition, miller effect (cap changing due to coupling b/w i/p and o/p, where the o/p is changing simultaneously while the i/p is changing), etc. So, a receiver model is also used to capture this cap as accurately as possible. 2 delay models are widely in use:

1. NLDM: (non linear delay model)

For simple NLDM model, driver is a linear voltage ramp in series with a resistor. This is captured via a lookup table (LUT), instead of equations, which are more time consuming to evaluate. The simple LUT model (aka NLDM) works for 22nm tech and above. It specifies delay at the midpoint (at 50% of rise or fall). We specify o/p delay + o/p transition time for different i/p slew rates and different o/p loads, via a LUT. So, slew rate (b/w 20% to 80% of rise or fall, with linear slope) and delay (b/w 50% rise/fall at i/p to 50% rise/fall at o/p) are the 2 important parameters that define the shape of the o/p waveform (o/p load and i/p slew are used as indexes). However, this simple table does not capture the exact waveform at the input or output of the cell; the o/p is just a fixed transition slew rate. This starts adding inaccuracies in delays when compared to spice models. Using a more complex CCS model allows us to capture the waveform more accurately, which is needed for tech < 22nm to get timing results within 2%-5% of spice results. The receiver model NLDM uses is a single cap value for a given timing path. However, actual cap values may differ based on rise/fall or min/max conditions.
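
As a rough illustration of the LUT lookup described above, here is a minimal sketch of bilinear interpolation in a 2D NLDM table. The table values, index points and the function name nldm_lookup are made up for illustration (they are not from any real library), and real tools may use a different interpolation/extrapolation scheme.

```python
from bisect import bisect_right

def nldm_lookup(slews, loads, table, slew, load):
    """Bilinear interpolation in a 2D NLDM table.

    slews: index_1 (i/p transition), ascending
    loads: index_2 (o/p cap), ascending
    table: table[i][j] = value at slews[i], loads[j]
    Points outside the table are linearly extrapolated from the edge segment.
    """
    def segment(axis, x):
        # lower index of the surrounding (or nearest edge) segment, plus position in it
        i = bisect_right(axis, x) - 1
        i = max(0, min(i, len(axis) - 2))
        frac = (x - axis[i]) / (axis[i + 1] - axis[i])
        return i, frac

    i, fs = segment(slews, slew)
    j, fl = segment(loads, load)
    low = table[i][j] * (1 - fl) + table[i][j + 1] * fl
    high = table[i + 1][j] * (1 - fl) + table[i + 1][j + 1] * fl
    return low * (1 - fs) + high * fs

# hypothetical 3x3 cell_rise table (ns), indexed by i/p slew (ns) and o/p load (pF)
slews = [0.01, 0.10, 0.50]
loads = [0.001, 0.010, 0.050]
cell_rise = [
    [0.020, 0.050, 0.180],
    [0.030, 0.062, 0.195],
    [0.070, 0.105, 0.240],
]

delay = nldm_lookup(slews, loads, cell_rise, slew=0.055, load=0.0055)
print(f"interpolated cell_rise = {delay:.4f} ns")
```

The same lookup would be repeated on the rise_transition/fall_transition tables to get the o/p slew that feeds the next stage.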

 

2. CCS: (composite current source model)

CCS model was developed to reduce inaccuracies at 20nm and lower tech. It models the driver as a time varying current source (rather than the voltage ramp of NLDM). It can handle highly resistive nets (driven by fast drivers), which are a problem for NLDM. CCS receiver model uses 2 cap values for each timing arc. It uses cap C1 for the receiver voltage going up to the midpoint of VDD, and then uses C2 for going from the midpoint of VDD to the end. This models miller cap more accurately. For receiver cap, we specify 2D tables for both rise/fall at i/p of receiver. 2D tables are for 4 parameters: receiver_capacitance1_rise, receiver_capacitance1_fall, receiver_capacitance2_rise, receiver_capacitance2_fall. We see at 7nm and below that C1 and C2 themselves differ by up to 20%, and they also vary by as much as 50% across different i/p slew rates and o/p loads. So, that signifies the importance of having these receiver models in CCS across diff slew rates and loads.


Representing Composite Current Source (CCS) Driver Information: In the Liberty syntax, using CCS model, you can represent nonlinear delay information at the pin level by specifying a current lookup table at the timing group level that is dependent upon input slew and output load. CCS describes each CCS driver switching current waveform by adaptively sampling data points. So basically we take the 2D lookup table from NLDM, and instead of specifying single transition time for each i/p slew and o/p load, we provide current value at different points in time.
To define your lookup tables, use the following groups and attributes:
1. output_current_template group in the library group level
2. output_current_rise and output_current_fall groups in the timing group level
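
To see how a sampled current waveform turns into a delay and a slew, here is a rough sketch that integrates i(t) into a fixed load cap (dV = i*dt/C, trapezoidal rule) and then measures the 50% crossing and the 20%-80% slew. The waveform samples, cap, VDD and helper names are all hypothetical; a real tool solves the driver together with the reduced RC network and the receiver cap model, not a single lumped cap.

```python
def integrate_voltage(times, currents, cap, v0=0.0):
    """Trapezoidal integration of i(t) into a cap: V(t) = v0 + (1/C) * integral(i dt)."""
    volts = [v0]
    for k in range(1, len(times)):
        dq = 0.5 * (currents[k] + currents[k - 1]) * (times[k] - times[k - 1])
        volts.append(volts[-1] + dq / cap)
    return volts

def crossing_time(times, volts, level):
    """First time at which a rising waveform crosses 'level' (linear interpolation)."""
    for k in range(1, len(times)):
        if volts[k - 1] < level <= volts[k]:
            frac = (level - volts[k - 1]) / (volts[k] - volts[k - 1])
            return times[k - 1] + frac * (times[k] - times[k - 1])
    raise ValueError("waveform never crosses the requested level")

# hypothetical rising-output current samples: time in ns, current in mA, cap in pF
# (mA * ns / pF = V, so the units work out directly)
times = [0.00, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06]
currents = [0.0, 0.2, 0.4, 0.4, 0.2, 0.0, 0.0]
cap = 0.015   # pF
vdd = 0.8     # V

volts = integrate_voltage(times, currents, cap)
t50 = crossing_time(times, volts, 0.5 * vdd)
slew = crossing_time(times, volts, 0.8 * vdd) - crossing_time(times, volts, 0.2 * vdd)
print(f"final V = {volts[-1]:.3f} V, t50 = {t50:.4f} ns, slew(20-80) = {slew:.4f} ns")
```

Note that the charge delivered (area under the current curve) divided by the cap gives the final voltage swing, which is why the voltage waveform can always be derived from the current waveform.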

Example of cell:

cell (AOI21_LVT) {
  pin(A) { //group for i/p pin. Similarly for all other i/p pins
    direction: input; // many other attributes defined
    receiver_capacitance () { ... } => tables for different index, and for different cond (when: "!A1&A2")
    internal_power () { ... } => tables
  }

  pin(Z) { //group for o/p pin
    direction: output; // many other attributes defined, such as function, etc
    internal_power () { ... } => tables for each related i/p pin for diff condition

    timing () { //for each related_pin, there may be more timing groups for each condition
      related_pin: "A";  //similarly for related_pin B, etc
      when: "!A&B"; //similarly for diff condition
      cell_rise (delay_8x8) {  ... } // similarly for cell_fall, rise_transition, fall_transition
      ocv_sigma_cell_rise (delay_8x8) { sigma_type: early; ... } //EARLY: similarly for ocv cell_fall, rise_transition, fall_transition
      ocv_sigma_cell_rise (delay_8x8) { sigma_type: late; ... } //LATE:

      ccsn_first_stage () {
        stage_type: both; //many more attr such as "when", etc
        dc_current (ccsn_dc_template) { ... }
        output_voltage_fall () { vector (template1) { ... } vector (template1) { ... } ...} //similarly for o/p voltage rise
        propagated_noise_high () { vector (template1) { ... } vector (template1) { ... } ...} //similarly for noise_low
      }

      receiver_capacitance1_rise () { ... }  //similarly for cap1_fall, cap2_rise, cap2_fall
      output_current_fall () { vector (template1) { ... } vector (template1) { ... } ...} //similarly for o/p current rise. These tables are big as they have current values for lot of time samples for each i/p slew and o/p load
    } //end of timing group

    timing () {
      related_pin: "B";
      ...
    }
  }
}

Example of lib:
library (new_lib) {
...
 output_current_template (CCT) { //template for CCS => o/p current waveform wrt 3 var below
  variable_1: input_net_transition;
  variable_2: total_output_net_capacitance;
  variable_3: time;
 }

  lu_table_template (ccsn_prop_template) { //template for noise
    variable_1 : input_noise_height;
    variable_2 : input_noise_width;
    variable_3 : total_output_net_capacitance;
    variable_4 : time;
  }

dynamic_current () { => this models dynamic current at power pins (VDD/VSS) of a gate (here inverter) with both rise/fall at i/p. This can be used to calculate dynamic peak IR more accurately. In absence of this, we use "fixed current" at power pins throughout the switching, which is not so accurate.

    related_inputs : "I";
    related_outputs : "Z";
    switching_group () {
      input_switching_condition (fall);
      output_switching_condition (rise); //o/p is rising, so current waveform is primarily thru VDD as it charges cap, however some short circuit current also flows thru VSS
      pg_current (VDD) {
        vector (ccsp_template2) {
          reference_time : 0.00138;
          index_1 ("0.0023"); //slew rate at i/p of gate
          index_2 ("0.00023"); //load on o/p of gate
          index_3 ("0, 0.0005314, 0.001445, 0.002875, 0.00306608, 0.00314519, 0.00327175, 0.00976063, 0.0144103, 0.0188013, 0.0204694, 0.0249671, 0.0381518, 1.44866"); // these are time delay from reference point of 0.00138 units
          values ( \
            "8.7941e-07, 0.0714941, 0.0554155, 0.109314, 0.0744151, 0.0740751, 0.0750564, 0.0518026, 0.0122771, 0.00195741, 0.000952697, 0.000121253, 1.55532e-07, 1.73258e-06" \ //as can be seen, current is almost 0 at start and end, but goes thru a peak in between. +ve values imply current is getting pulled out of VDD.
          );
        }

           vector (ccsp_template2) { // we repeat above table multiple times for different slew rates and loads. They may end up with different reference times depending on delay thru cell
       }

       pg_current (VSS) { //similarly for VSS pin. NOTE that for VSS, current values are -ve (implying current is pushed into VSS), and they are of much smaller magnitude than VDD current, as it's only small amount of short circuit current
        vector (ccsp_template2) {
          reference_time : 0.00138;
          index_1 ("0.0023");
          index_2 ("0.00023");
          index_3 ("0, 0.000670518, 0.00175046, 0.002875, 0.00300388, 0.00345785, 0.00367968, 0.00392382, 0.00409207, 0.00437108, 0.00472153, 0.00507438, 0.00524245, 0.00675625, 0.00699046, 0.00995021, 0.0131331, 0.0144103, 0.015075, 0.0167913, 0.0188013, 0.0204694, 0.0225021, 0.0249671, 0.0467915, 1.44078, 1.44866");
          values ( \
            "-8.82073e-07, 0.0923592, 0.0406992, 0.0189937, -0.00807851, -0.0116778, -0.0111549, -0.012948, -0.0129895, -0.0129045, -0.0145467, -0.0128898, -0.0144827, -0.0128716, -0.0132787, -0.00995613, -0.00380439, -0.00217233, -0.00167868, -0.000805295, -0.000346744, -0.000158811, -7.18249e-05, -1.5272e-05, 5.02506e-06, -8.37263e-06, 5.12667e-06" \
          );
        }

   switching_group () { //repeat above group for other dirn, i.e rise at i/p
      input_switching_condition (rise);
      output_switching_condition (fall);
      pg_current (VDD) { ... } //NOTE: current values for VDD are -ve here, while VSS are -ve too (implying current is pushed into both VDD and VSS here, maybe because of ripple at o/p which causes o/p voltage to be higher than VDD)
      pg_current (VSS) { .. } //similarly for VSS. VSS current lot higher than VDD current as only small amount of short circuit current flows thru VDD

  }

}//end of dynamic current section
...

pin(Z) { ...
 timing() { //For CCS, timing section has extra CCS LUT

   cell_rise (delay_tem...) { .... } //regular NLDM LUT is also present here, so that NLDM will be used if specified in the tool

   ccsn_first_stage () { //This specs CCS for first stage of gate (channel connected block or CCB) if gate has multiple stages inside it. For ex, AND gate has nand followed by inverter. So, we repeat this section for last_stage too.
        is_inverting : true;
        is_needed : true;

        when: "A&!SE|SD"; //all CCS values below can be defined condition based
        miller_cap_fall : 0.000207711;
        miller_cap_rise : 0.000205185;
        stage_type : both;
        dc_current (ccsn_dc_template) { //2D  dc current table which lists the DC current measured at CCB  o/p node, with indexes specifying i/p node and o/p node voltage
          index_1 ("-0.95, -0.475, -0.19, -0.095, 0, 0.0475, 0.095, 0.1425, 0.19, 0.2375, 0.285, 0.3325, 0.38, 0.4275, 0.475, 0.5225, 0.57, 0.6175, 0.665, 0.7125, 0.76, 0.8075, 0.855, 0.9025, 0.95, 1.045, 1.14, 1.425, 1.9"); //i/p voltage
          index_2 ("-0.95, -0.475, -0.19, -0.095, 0, 0.0475, 0.095, 0.1425, 0.19, 0.2375, 0.285, 0.3325, 0.38, 0.4275, 0.475, 0.5225, 0.57, 0.6175, 0.665, 0.7125, 0.76, 0.8075, 0.855, 0.9025, 0.95, 1.045, 1.14, 1.425, 1.9"); //o/p voltage
          values (  "0.436551, 0.363591, 0.349409, 0.343731, 0.337281, 0.333664, ... ", ) //and so on ..

        }

  output_voltage_rise() { //voltage waveforms are not important in CCS, as currents are used to come up with delay and slew at o/p (I=Cdv/dt, So, deltaV can be calculated from i(t) and C). So, we see very few vectors for voltage waveform, but a lot for current waveform

          vector (ccsn_vout_template) {
            index_1 ("0.02306"); => i/p tran
            index_2 ("0.0018245"); => o/p cap
            index_3 ("0.0215692, 0.0266808, 0.03194, 0.0380183, 0.0470364"); => time
            values ( \
              "0.095, 0.28, 0.475, 0.66, 0.82" \ => provides sample points of o/p voltage. voltage is 0.095V at ~21.6ps, then 0.28V at ~26.7ps, and so on ..
            );
          }

         vector (ccsn_vout_template) { ... } //this is repeated for diff slew rates and load
   }

 output_voltage_fall() { ... }


  output_current_rise() { //most important section for CCS. It provides detailed current waveform at all possible i/p slew and o/p load. So, for 7x8 NLDM LUT, there would be about 56 (7*8) vectors here. So, this section is usually long
   vector(CCT) {
    reference_time : 0.05;
    index_1(0.1); => i/p tran
    index_2(2.1); => o/p cap
    index_3("1.0, 1.5, 2.0, 2.5, 3.0"); => time
    values("0.0003, 0.007, 0.022, 0.027, 0.028" ); => current values of the driver model for current rising at o/p. NOTE: current is not in shape of bell curve here, not sure why, maybe the rise time is very sharp, so not captured here
    }

   vector(next1) { .. } //for other slew rates and load

  }  
   }
  }


  output_current_fall() { ... }

 

       propagated_noise_high () { // This is to be able to run noise runs. It propagates noise thru the cell, and shows how o/p waveform looks for different i/p waveform
          vector (ccsn_prop_template) { //similarly for other vectors
            index_1 ("0.595548"); => i/p noise height
            index_2 ("0.283096"); => i/p noise width
            index_3 ("0.0018245"); => o/p cap
            index_4 ("0.141002, 0.154822, 0.183612, 0.20954, 0.226069"); => time
            values ( \
              "0.810785, 0.727257, 0.671571, 0.727257, 0.810785" \ => waveform of o/p noise sampled at various times. At t=0.14, V=0.8V(which is =VDD), then it dips a little, then goes back to VDD. For noise_low, it will be bump from VSS, back to VSS
            );
          }
    propagated_noise_low () {  ... } //for low noise

receiver_capacitance1_rise (delay_template_7x7_0) { //NOTE: these cap values are for o/p pins, not sure why we need for o/p pins, when we have it for i/p pins
        index_1 ("0.00205853, 0.00859214, 0.0216594, 0.0477043, 0.0998837, 0.204153, 0.412781");
        index_2 ("0.00023, 0.00081, 0.00196, 0.00426, 0.00887, 0.01807, 0.03649");
        values ( \
          "0.000400186, 0.000424999, 0.000440668, 0.000447652, 0.000451636, 0.000453669, 0.000454712", \
          ....
          "0.000527481, 0.000507608, 0.000493188, 0.000483650, 0.000477769, 0.000473495, 0.000472111" \
        );
      }
 receiver_capacitance2_rise (delay_template_7x7_0) { .. }

receiver_capacitance1_fall (delay_template_7x7_0) { .. }

receiver_capacitance2_fall (delay_template_7x7_0) { .. }

} //end of ccsn_first stage

ccsn_last_stage () { .... } //repeat whole section above for last stage if more than 1 stage present in stdcell. NOTE: last stage is the important one for any stdcell, as we care about what comes at the o/p of cell, and not much about what happens on internal nodes. Usually, if stdcell has only 1 stage, we only have values for ccsn_first_stage (which is actually the last stage). If stdcell has multiple stages, then ccsn_first_stage is very small (has only voltage and noise waveforms, no other groups)

   internal_power () {

      related_pin : "I";
      related_pg_pin : VDD;
      rise_power (power_template_7x7_0) { .. } //tables for both rise and fall power. Only shown for VDD pin, as power is delivered via VDD only
      fall_power (power_template_7x7_0) { .. }
    ...
   }
  }
 }
}
 
NOTE: there may be too many such arcs to represent current adequately at each slew rate and load. So, there is also a compact CCS representation in .lib, so that the .lib file doesn't grow tremendously.
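
As a side illustration of what the pg_current vectors in the dynamic_current group carry, here is a small sketch that takes the VDD vector shown in the lib example above (index_3 times and values copied as-is) and computes the peak current, when it occurs, and the total charge drawn (area under the curve, trapezoidal rule). The units are whatever the library header declares, which isn't shown here, so the numbers are only meaningful relative to each other; the helper name is hypothetical.

```python
# index_3 (time offsets from reference_time) and values from the pg_current (VDD) vector above
times = [0, 0.0005314, 0.001445, 0.002875, 0.00306608, 0.00314519, 0.00327175,
         0.00976063, 0.0144103, 0.0188013, 0.0204694, 0.0249671, 0.0381518, 1.44866]
currents = [8.7941e-07, 0.0714941, 0.0554155, 0.109314, 0.0744151, 0.0740751, 0.0750564,
            0.0518026, 0.0122771, 0.00195741, 0.000952697, 0.000121253, 1.55532e-07, 1.73258e-06]

def trapezoid_area(x, y):
    """Area under a piecewise-linear waveform sampled at points (x[k], y[k])."""
    return sum(0.5 * (y[k] + y[k - 1]) * (x[k] - x[k - 1]) for k in range(1, len(x)))

peak = max(currents)
t_peak = times[currents.index(peak)]
charge = trapezoid_area(times, currents)

print(f"peak current = {peak} at t = {t_peak} (offset from reference_time)")
print(f"total charge = {charge:.3e} (current unit * time unit)")
```

The long tail out to 1.44866 adds almost nothing to the charge, which shows why the samples are placed adaptively: dense around the peak, sparse where the current is flat.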

Variations in process parameters: To account for this, new extensions added to liberty

Liberty Variation Format (LVF): These are extensions to the lib format. They are used to specify variation parameters which are needed for OCV timing analysis. Many new groups are defined for LVF. We can use these groups in regular .lib files, as long as the tools support reading these LVF groups.

timing () {

 cell_rise (delay_temp_8x8) { //regular cell delay for rise

       index_1 ("0.0019, 0.0058, 0.0137, 0.0295, 0.061, 0.1241, 0.2502, 0.5025");
        index_2 ("0.00016, 0.00088, 0.00232, 0.00519, 0.01093, 0.02241, 0.04538, 0.09131");
        values ( \
          "0.00814129, 0.0102127, 0.0141374, 0.0218088, 0.0370626, 0.067524, 0.128439, 0.250268", \
          ...
          "0.17358, 0.190633, 0.214257, 0.245593, 0.287164, 0.341309, 0.429989, 0.585219" \
        );
      }

 ocv_sigma_cell_rise (delay_temp_8x8) { //sigma values for cell delay rise. Each value specifies 1 sigma delta from nominal delay value above. Used in POCV analysis. Here sigma value is different for different slew/load.

sigma_type : early;
        index_1 ("0.0019, 0.0058, 0.0137, 0.0295, 0.061, 0.1241, 0.2502, 0.5025");
        index_2 ("0.00016, 0.00088, 0.00232, 0.00519, 0.01093, 0.02241, 0.04538, 0.09131");
        values ( \
          "0.000311555, 0.000401142, 0.000580422, 0.000937877, 0.00165292, 0.00308312, 0.00594014, 0.011653", \ => Here, 0.0003 is the 1 sigma offset from the mean of 0.0081 specified above for the given load/slew rate. So, the offset is about 4% of the mean, which can be significant when added across multiple gates. Also, note that the sigma offset as a % of mean delay is diff for diff load/slew rate, so having a single sigma offset value would have given inaccuracies.
     ....
          "0.0101789, 0.0101964, 0.0102317, 0.0103037, 0.0104537, 0.0107767, 0.0143185, 0.021458" \
        );
      }
      ocv_sigma_cell_rise (delay_template_8x8) {
        sigma_type : late;
        index_1 ("0.0019, 0.0058, 0.0137, 0.0295, 0.061, 0.1241, 0.2502, 0.5025");
        index_2 ("0.00016, 0.00088, 0.00232, 0.00519, 0.01093, 0.02241, 0.04538, 0.09131");
        values ( \
          "0.000391953, 0.000508076, 0.000740406, 0.00120356, 0.00212998, 0.00398288, 0.00765485, 0.0149973", \
        ...
          "0.0104575, 0.010507, 0.0106064, 0.0108063, 0.0112123, 0.0120466, 0.0167537, 0.026188" \
        );
      }

  ... }
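
As a rough illustration of how these sigma tables may be used in a POCV-style calculation: nominal delays add linearly along a path, while the sigmas of independent stages add as root-sum-square, and the path delay is then taken at mean + N*sigma (late) or mean - N*sigma (early). The sketch below uses the first entries of the tables above as the per-stage values. The 5 identical stages, N=3 and the independence of stages are assumptions made for illustration, not settings from any tool.

```python
import math

def path_delay(stages, n_sigma=3.0, late=True):
    """stages: list of (nominal_delay, sigma) per cell arc.

    Assumes stage variations are independent, so sigmas combine as
    root-sum-square (an assumption of this sketch, not a tool setting).
    """
    mean = sum(d for d, _ in stages)
    sigma = math.sqrt(sum(s * s for _, s in stages))
    return mean + n_sigma * sigma if late else mean - n_sigma * sigma

# first entries of the cell_rise / ocv_sigma_cell_rise tables above
nominal = 0.00814129
sigma_early = 0.000311555
sigma_late = 0.000391953

n_stages = 5  # hypothetical path of 5 identical stages
late = path_delay([(nominal, sigma_late)] * n_stages, late=True)
early = path_delay([(nominal, sigma_early)] * n_stages, late=False)
naive_late = n_stages * (nominal + 3.0 * sigma_late)  # per-stage 3-sigma margins simply added up

print(f"sigma/nominal (early) = {sigma_early / nominal:.1%}")
print(f"late  path delay = {late:.6f}")
print(f"early path delay = {early:.6f}")
print(f"naive late sum   = {naive_late:.6f}")
```

The statistical late delay comes out smaller than the naive sum of per-stage 3-sigma delays, which is the pessimism reduction that per-arc sigma tables are meant to enable.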