General Synthesis Flow:

Synthesis transforms RTL into a gate-level netlist. Since the goal of the synthesis tool is not only to map the RTL into gates, but also to optimize the logic to meet timing, power and area requirements, it needs a few other inputs to do the job.

Inputs = RTL (with pragmas), constraints (SDC) and timing libraries in Liberty format (.lib).

Output = gate-level Verilog netlist.

A well-synthesized netlist matters because when die utilization approaches 95% to 100% (the red zone), meeting timing becomes difficult. So, a 3-5% reduction in area keeps the design away from the red zone.

Synthesis Tools:

The 2 most widely used synthesis tools are provided by Synopsys and Cadence. Synopsys provides DC (Design Compiler), while Cadence provides RC (RTL Compiler).

Synthesis Inputs:

1. RTL:

We write RTL in an HDL such as Verilog, SystemVerilog or VHDL, and all synthesis tools are able to synthesize it.

Synthesis pragmas:  These are special helper comments. They are put inside comments (of the Verilog or VHDL file) preceded by the word "synopsys" or "cadence" so that DC/RC can identify them.

cadence pragmas: 2 ways to put them in:
// cadence pragma_name => single-line comment
/* cadence pragma_name */ => multiline comment

synopsys pragmas: Similarly, Synopsys pragmas can also be put in 2 ways:

// synopsys pragma_name => single-line comment

/* synopsys pragma_name */ => multiline comment

pragma names:


I. parallel_case: used in a case stmt to specify that the case items are non-overlapping (1-hot).
ex:
case(1'b1) //cadence parallel_case
sel[0]: out = A[0]; //when sel[1:0]=01 or 11, out=A[0], as 1st matching case stmt is executed.
sel[1]: out = A[1]; //when sel[1:0]=10, out=A[1]. when sel[1:0]=00, then latch inferred, since no default case defined.
endcase

if the pragma wasn't there, then priority logic would be built, since if both sel[0] and sel[1] are 1, then the first matching case stmt is executed, so out=A[0] in such a case. All case stmts are treated as non-parallel for synthesis purposes, since that is how RTL is simulated.
out= (sel[0] and A[0]) or (!sel[0] and sel[1] and A[1]);

however, since the pragma is there, no priority logic is built, as shown below. So if sel[1:0]=11, out=A[0] or A[1]. Having the pragma saves unneeded priority logic, so it keeps gate count lower.
out= (sel[0] and A[0]) or (sel[1] and A[1]); => this may result in mismatches in formal verification or simulation if sel[1:0]=11 is applied.

II. map_to_mux or infer_mux: used in case and if-then-else stmts to force RC to use a MUX from the library.
case (sel) //cadence map_to_mux => forces mux, meaning RC doesn't optimize this logic to seek other logic
2'b00: out = A;
2'b01: out = B;
2'b10: out = C;
2'b11: out = D;
endcase

III. infer_multi_bit pragma => maps registers, multiplexers and 3-state drivers to multibit library cells.
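A hedged sketch of how such a pragma might be placed (the exact pragma spelling and placement vary by tool and version; Synopsys, for example, documents an infer_multibit directive, so check your tool docs):

```verilog
module mb_regs (input clk, input [3:0] data_d, output reg [3:0] data_q);
  // cadence infer_multi_bit  => hint to map this 4-bit register bank
  // to a multibit flop cell from the library, instead of 4 single-bit flops
  always @(posedge clk)
    data_q <= data_d;
endmodule
```

Multibit cells share clock-pin routing and internal inverters, so they save area and clock power.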


2. Timing Library (in liberty or other proprietary format):

The gate library that we use during synthesis is the timing gate library. It's in Liberty format. It has timing info for each gate, as well as the functionality of each gate. Using the functionality info, the synthesis tool is able to map the gates to RTL logic, and using the timing info, it's able to check if it's meeting the timing requirement of the design. The question is which timing library should we use? Should we use timing for the typical corner or the max or min corner? Since we want to design our chip such that it meets timing even in the worst possible scenario, we choose the "worst case" timing library, which is the max delay library.


Example of a max delay lib: Let's assume the chip runs at 1.8V typical. Since we design the chip so that it should also run at +/-10% voltage swings (due to IR drop, overshoot, etc), our worst case PVT corner would be Process=weak, Voltage=1.65V and Temperature=150C (since high temp slows transistors). So, a liberty file such as W_150C_1.65V.lib would be used. Not all lib cells may be in one library, so we may use multiple libraries. As an ex, all core cells may be in *CORE.lib, while all clock tree cells may be in *CTS.lib.
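For reference, here is a heavily trimmed sketch of what a Liberty entry looks like (the cell/pin names and values here are made up; real libraries carry full delay/transition tables per timing arc):

```text
library (W_150C_1.65V) {
  operating_conditions (WORST) { process : 1.5; voltage : 1.65; temperature : 150; }
  cell (ND2X1) {                 /* hypothetical 2-input NAND */
    area : 2.5;
    pin (A) { direction : input; capacitance : 0.002; }
    pin (B) { direction : input; capacitance : 0.002; }
    pin (Y) {
      direction : output;
      function : "!(A & B)";     /* functionality info used for mapping */
      timing () { related_pin : "A"; /* delay tables go here */ }
    }
  }
}
```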

NOTE: no tech/core lef or cap tables provided, as net delay estimated based on WLM (wire load model) which has resistance/cap defined per unit length (length is estimated based on Fanout). If physical synthesis is done, which tries to do physical placement during synthesis itself, then WLM is not used (as in RC PLE or DC-topo, both of which do physical based synthesis). In such a case, core lef file and cap table files are provided.

3. Constraints (in sdc format):
In constraints file, we specify all the constraints that our synthesis tool will try to honor. Constraints are of 2 types: Environment constraints and Design constraints. Both of these are provided via an SDC file.

Timing constraints: One of the most important design constraints in sequential digital design is clock frequency. The tool tries to meet timing once the clk freq or clk waveform is given. For input/output ports, we also provide the IO delay.

Invalid paths: We provide false paths or multicycle paths for paths that are not valid 1-cycle paths. In the false_paths file, we define all false paths on the gate-level netlist.
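A minimal SDC sketch covering the constraints above (the port/pin names are hypothetical placeholders):

```tcl
# clock: 100 MHz on port clk
create_clock -name clk -period 10.0 [get_ports clk]

# IO delays relative to clk (assume external logic takes 3 ns)
set_input_delay  3.0 -clock clk [get_ports data_in]
set_output_delay 3.0 -clock clk [get_ports data_out]

# paths that are not valid 1-cycle paths
set_false_path -from [get_ports async_rst_n]
set_multicycle_path 2 -setup -from [get_pins slow_reg/Q]
```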

While synthesizing, Synthesis tool optimizes setup for all data paths and clk gating paths. No hold checks or async recovery/removal checks done.
Once the tool synthesizes the RTL and meets setup time, it's done. No clk propagation is done, and no hold fixing is done (although both setup and hold timing reports are produced). The hold rpt should have no failure as the clk is ideal, and c2q of the flop is enough to meet hold time, since hold time for most flops is -ve (this is because of extra delay in the data path, which makes setup time more +ve and hold time less +ve. The worst case for hold time is a very small +ve number. NOTE: more delay in the data path inc setup time, dec hold time, while more delay in the clk path dec setup time, inc hold time).


Optimization priority: Not all constraints that we specify have equal priority. Highest priority is given to constraints that can make a chip malfunction (i.e. timing constraints), while lowest priority is given to constraints that are good to meet, but don't make a chip malfunction (i.e. power constraints).

Below are the various cost types for various constraints. Basically all these constraints end up as some cost in a big cost function, and the tool's job is to minimize this cost. DC from Synopsys uses cost types to optimize the design. Cost types are design rule cost and optimization cost. By default, highest priority is given to design rule cost (the top one), and priority goes down as we move to the bottom ones.
1. design rule cost         => constraints are DRC (max_fanout, max_trans, max_cap, connection class, multiple port nets, cell degradation)
2. optimization cost:
 A. delay cost          => constraints are clk period, max_delay, min_delay
 B. dynamic power cost         => constraints are max dynamic power
 C. leakage power cost         => constraints are max lkg power
 D. area cost              => constraints are max area
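The constraints behind each cost bucket are set with cmds like these (the values are placeholders, not recommendations):

```tcl
# 1. design rule costs (highest priority)
set_max_fanout      16  [current_design]
set_max_transition  0.5 [current_design]
set_max_capacitance 0.2 [current_design]

# 2. optimization costs, in decreasing priority
create_clock -name clk -period 10.0 [get_ports clk]  ;# A. delay cost
set_max_dynamic_power 50 mw                          ;# B. dynamic power cost
set_max_leakage_power 1 mw                           ;# C. leakage power cost
set_max_area 0                                       ;# D. area cost (0 => minimize area)
```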

Power optimization:

Above 90nm, power opt used to be a low priority. But with leakage power increasing, and the desire to have chips last longer on battery power, optimizing chip power has become a high priority for chips going into handheld devices. These are a few of the techniques for reducing power:

1. Clock gating: Here clock gating logic is inserted for register banks (i.e. a collection of flops). This reduces switching of the clk every cycle, since we disable the clk when data is not being written into the registers. Clock gating is inserted either when the RTL has clock gating coded, or the tool can automatically infer and insert clock gating logic.

ex: See clk gaters below

2. Leakage power opt: Lkg power is becoming a larger portion of overall power for low nm tech (<90nm). Multiple threshold voltages are used to reduce lkg power.

3. Dynamic power opt: Dynamic power opt consists of 2 power components: 1. short circuit power 2. switching power due to charging/discharging of net/gate caps (due to transistors switching)
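The switching component follows the standard CMOS relation (α = switching activity factor, C = switched capacitance, V = supply voltage, f = clock frequency); the square-law dependence on V is what makes the voltage-based techniques below so effective:

```latex
P_{switching} = \alpha \, C \, V^{2} f
```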

4. Advanced Power management techniques: Here we employ advanced power techniques. These techniques are captured in a UPF/CPF file (see Power intent and standards).

  • MSV: Using multiple supply voltages (MSV) in design: This technique is most widely used. We use lower voltages to power parts of the design which don't need to run that fast, while using higher voltages for logic that is performance critical. This can result in huge power savings as dynamic power varies as the square of voltage.
  • PSO: Using power shut off (PSO) methodology: Here, some parts of design are switched on and off internally depending on their usage at that time. This saves both leakage and dynamic power.
  • DVFS: Using Dynamic voltage frequency scaling (DVFS): Here, voltage and frequency of parts of the chip or the whole chip are scaled down when peak perf is not required. DVFS can be seen as a special case of MSV design operating in multiple design modes.


FLOW:

Below is the flow for running synthesis. Specific flow scripts will be explained in detail in the sections for DC and RC/Genus. This is a more general explanation.

  1. In init file, specify lib, lef, etc. Set other attr/parameters.
  2. Read RTL, elaborate, and check design.
  3. Set Environment Constraints using SDC file => op_cond (PVT), load (both i/p and o/p), drive (only on i/p), fanout (only on o/p) and WLM. dont_touch, dont_use directives also provided here.
  4. Do Initial synthesis with low effort, since we just need gate netlist from RTL to write our false path file. Write initial netlist.
  5. Set design constraints using SDC file => case_analysis, i/p,o/p delays, clocks/generated clocks, false/multicycle paths. We use case_analysis to set part in func mode (set scan_mode to 0, since we are not interested in timing when part is in scan mode). Strictly speaking, this is not required, but then reports may become difficult to read. So, over here we set scan_mode to 0 to see func paths only. Later during PnR, we run timing separately with scan_mode set to 1, so that we see timing paths during scan_mode. Thus we are covered for both cases of scan_mode.
          IMP: Do NOT force scan_en to 0, as that's real path and we want to see paths both during scan_capture mode as well as during scan_shift mode. If we force scan_en to 0, then scan_shift paths are removed from analysis altogether. Many of these paths fail hold time, so it's OK in synthesis flow, but in PnR flow, we want these paths to be fixed for both setup and hold violations. Since we use the same case_analysis file in PnR, we don't want to set scan_en to 0.
  6. Do Final synthesis with high effort. Report timing, area and other reports. Write Final non-scan netlist.
  7. For SCAN designs, we need to add scan pins, convert flops to scan flops, stitch them, and spit out a scan netlist. Below are the additional steps needed.
    1. set below scan related settings:
            A. set ideal_network attr for scan_en_in pin, so that DC/RC doesn't buffer it. We let PnR tool buffer it.
            B. set false_path from scan_en_in pin (ending at clk of all flops). Otherwise large tr on scan_en_in causes huge setup/hold viol.
            C. set other dft attr. define test protocol, and define scan_clk, async set/reset, SDI, SDO, SCAN_EN and SCAN_MODE.
            D. do dft DRC checking and fix all violations.
    2. Replace regular FF with scan flops, connect chain, do dft DRC checking, print timing, area and other reports. Write Final scan netlist. Synthesize again if needed (not needed since timing is usually met).
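The scan steps 1A-1D and 2 above map roughly to DC-Test cmds like these (pin/port names taken from this flow; exact cmds and options depend on the tool version, so treat this as a sketch):

```tcl
# A. don't buffer scan enable in synthesis; PnR buffers it later
set_ideal_network [get_ports scan_en_in]

# B. scan enable is not timed to the flop clocks
set_false_path -from [get_ports scan_en_in]

# C. scan style, test signals and test protocol
set_scan_configuration -style multiplexed_flip_flop
set_dft_signal -view existing_dft -type ScanClock   -port clk -timing {45 55}
set_dft_signal -view spec         -type ScanEnable  -port scan_en_in
set_dft_signal -view spec         -type ScanDataIn  -port SDI
set_dft_signal -view spec         -type ScanDataOut -port SDO
create_test_protocol

# D. DFT DRC checking, then replace flops and stitch chains (step 2)
dft_drc
insert_dft
```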

NOTE: clock, reset and scan_enable should not be buffered in DC/RC, as that's taken care of in PnR much better since layout is available. However, most of the time, synthesis scripts end up buffering the reset path during synthesis, which is not a good practice.

 

Library cells:

Below are some examples of RTL and their synthesized netlist. This or a very similar netlist would most likely be spit out by any synthesis tool. I generated the netlists using the Synopsys DC tool.

1. Flop:

RTL:
module flop (input Din, input clk, output reg Qout);
always @(posedge clk) Qout<=Din;
endmodule

Synthesized Gate:
module flop (input Din, input clk, output Qout); //NOTE: Qout is no more a reg, it's a wire.
FLOP2x1 Qout_reg (.D(Din), .CLK(clk), .Q(Qout), .QZ()); //name of flop is output port followed by _reg
endmodule

2. Clk Gaters:

RTL:
always @(posedge clk) begin
 if (En) Qout <= Din;
end

Synthesized Gate:
module SNPS_CLOCK_GATE_HIGH_spi_0 ( CLK, EN, ENCLK, TE );
  input CLK, EN, TE;
  output ENCLK;
  CGPx2 latch ( .TE(TE), .CLK(CLK), .EN(EN), .GCLK(ENCLK) );
endmodule

module AAA ( ... );
SNPS_CLOCK_GATE_HIGH_spi_0 clk_gate_Qout_reg ( .CLK(clk), .EN(En), .ENCLK(n38), .TE(n_Logic0) ); //Test Enable tied to 0, since non-scan design
FLOP2x1 Qout_reg ( .D(Din), .CLK(n38), .Q(Qout) );
endmodule

3. Adders:

RTL: Z = A + B; //assume 6 bits

Synthesized Gate:
module add_unsigned_310(A, B, Z);
  input [5:0] A;
  input [6:0] B;
  output [7:0] Z;
  wire [5:0] A;
  wire [6:0] B;
  wire [7:0] Z;
  wire n_0, n_2, n_4, n_6, n_8;
  assign Z[7] = 1'b0;
  FA320 g97(.A (n_8), .B (B[5]), .CI (A[5]), .CO (Z[6]), .S (Z[5])); //Full adder, S = (A EXOR (B EXOR CI)) and CO = (A & B) + (B & CI) + (CI & A), 2X Drive
  FA320 g98(.A (n_6), .B (B[4]), .CI (A[4]), .CO (n_8), .S (Z[4]));
  FA320 g99(.A (n_4), .B (B[3]), .CI (A[3]), .CO (n_6), .S (Z[3]));
  FA320 g100(.A (n_2), .B (B[2]), .CI (A[2]), .CO (n_4), .S (Z[2]));
  FA320 g101(.A (n_0), .B (A[1]), .CI (B[1]), .CO (n_2), .S (Z[1]));
  HA220 g102(.A (A[0]), .B (B[0]), .CO (n_0), .S (Z[0])); //Half adder, S = (A EXOR B), CO = (A & B), 2X Drive
endmodule

module aaa ( ... );
add_unsigned_310 add_115_47(.A (A[5:0]), .B ({1'b0,B[5:0]}), .Z ({UNCONNECTED1,Z[6:0]}));
endmodule

4. Division:

RTL: Y = A / B; assume A[15:0], B[5:0], Y[15:0]

Synthesized Gate: Not yet done. FIXME??

-------------------


Difference in DC(design compiler) vs EDI(encounter digital implementation):
-----------------------
1. Many of the cmds work in both DC and EDI. The biggest difference is in the way they show o/p. In all the cmds below, if we use the tcl set command to set a variable to the o/p of any of these cmds, then in DC it contains the actual object, while in EDI it contains a pointer and not the actual object. We have to do a query_objects in EDI to print the object. DC prints the object by using list.

2. Unix cmds don't work directly in EDI, while they do in DC. So, for EDI, we need to put the "exec" tcl cmd before the linux cmd, so that it's interpreted by the tcl interpreter within EDI.

3. Many newer tcl cmds like "lassign", etc. don't work in EDI.

4. NOTE: a script written for EDI will always work in DC, as it's written as pure tcl cmds.
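Point 1 in cmd form (hypothetical instance names):

```tcl
set my_cells [get_cells u_spi_*]

# DC: $my_cells holds the actual collection; printing it shows the objects
echo [get_object_name $my_cells]

# EDI: $my_cells holds only a pointer; need query_objects to see the objects
query_objects $my_cells
```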

Design compiler:
---------------------

Register inference: (https://solvnet.synopsys.com/dow_retrieve/F-2011.06/dcrmo/dcrmo_8.html?otSearchResultSrc=advSearch&otSearchResultNumber=2&otPageNum=1#CIHHGGGG)
-------
On doing elaborate on RTL, the HDL compiler (Presto HDLC for DC) reads in a Verilog or VHDL RTL description of the design, and translates the design into a technology-independent representation (GTECH). During this, all "always @" stmts are looked at for each module. Mem devices are inferred for flops/latches, and "case" stmts are analyzed. After that, the top-level module is linked, all multiple instances are uniquified (so that each instance has a unique module defn), and clk-gating/scan and other user-supplied directives are looked at. Then pass-1 mapping and then opt are done. Unused regs, unused ports and unused modules are removed.

#logic level opt: works on opt GTECH netlist. consists of 2 processes:
A. structuring: subfunctions that can be factored out are optimized. Also, intermediate logic structure and variables are added to design
B. Flattening: comb logic paths are converted to 2 level SOP, and all intermediate logic structure and variables are removed.

This generic netlist has following cells:
1. SEQGEN cells for all flops/latches (i/p=clear, preset, clocked_on, data_in, enable, synch_clear, synch_preset, synch_toggle, synch_enable, o/p= next_state, Q)
2A. ADD_UNS_OP for all unsigned adders/counters comb logic(i/p=A,B, o/p=Z). these can be any bit adders/counters. DC breaks large bit adders/counters into small bit (i.e 8 bit counter may be broken into 2 counters: 6 bit and 2 bit). Note that flops are still implemented as SEQGEN. Only the combinatorial logic of this counter/adder (i.e a+b or a+1) is impl as ADD_UNS_OP, o/p of which feeds into flops.
2B. MULT_UNS_OP for unsigned multiplier/adder?
2C. EQ_UNS_OP for checking unsigned equality b/w two set of bits, GEQ_UNS_OP for greater than or equal (i/p=A,B, o/p=Z). i/p may be any no. of bits but o/p is 1 bit.
3. SELECT_OP for Muxes (i/p=data1, data2, ..., datax, control1, control2, ..., controlx, o/p=Z). May be any no. of i/p,o/p.
4. GTECH_NOT(A,Z), GTECH_BUF, GTECH_TBUF, GTECH_AND2/3/4/5/8(A,B,C,..,Z), GTECH_NAND2/3/4/5/8, GTECH_OR2/3/4/5/8, GTECH_NOR2/3/4/5/8, GTECH_XOR2/3/4, GTECH_XNOR2/3/4, GTECH_MUX*, GTECH_OAI/AOI/OA/AO, GTECH_ADD_AB(Half adder: A,B,S,COUT), GTECH_ADD_ABC(Full adder: A,B,C,S,COUT), GTECH_FD*(D FF with clr/set/scan), GTECH_FJK*(JK FF with clr/set/scan), GTECH_LD*(D Latch with clr), GTECH_LSR0(SR latch), GTECH_ISO*(isolation cells), GTECH_ONE/ZERO, for various cells. DesignWare IP (from synopsys) use these cells in their implementation. NOTE: in DC gtech netlist, we commonly see GTECH gates as NOT, BUF, AND, OR, etc. Flops, latches, adders, mux, etc are rep as cells shown in bullets 1-4 above.
5. All directly instantiated lib components in RTL.
6. If we have a DesignWare license, then we also see DesignWare elements in the netlist. All DesignWare components are rep as DW*. For ex: the DW adder is DW01_add (n-bit width, where n can be passed as defparam or #). Maybe the *_UNS_OP above are DesignWare elements.

#gate level opt: works on the generic netlist created by logic level opt to produce a technology-specific netlist. consists of 4 processes:
A. mapping: maps gates from tech lib to gtech netlist. tries to meet timing/area goal.
B. Delay opt: fix delay violations introduced during mapping. does not fix design rule or opt rule violations
C. Design rule fixing: fixes Design rule by inserting buffers or resizing cells. If necessary, it can violate opt rules.
D. Opt rule fixing: fixes opt rule, once the above 3 phases are completed. However, it won't fix these, if it introduces delay or design rule violations.
-------

In GTECH, both registers and latches are represented by a SEQGEN cell, which is a technology-independent model of a sequential element as shown in Figure 8-1. SEQGEN cells have all the possible control and data pins that can be present on a sequential element.

FlipFlop or latch is inferred based on which pins are actually present in the SEQGEN cell. A register is a latch or FF. A D-latch is inferred when the resulting value of the o/p is not specified under all conditions (as in an incompletely specified IF or CASE stmt). SR latches and master-slave latches can also be inferred. A D-FF is inferred whenever the sensitivity list of an always block or process includes an edge expression (rising/falling edge of a signal). JK FF and Toggle FF can also be inferred.
#_reg is added to the name of the reg from which ff/latch is inferred. (i.e count <= .. implies count_reg as name of the flop/latch)


o/p: Q and QN (for both flop and latch)
i/p:
1. Flop:  clear(asynch_reset), preset(async_preset), next_state(sync data Din),  clocked_on(clk),  data_in(1'b0),           enable(1'b0 or en), synch_clear(1'b0 or sync reset), synch_preset(1'b0 or sync preset), synch_toggle(1'b0 or sync toggle), synch_enable(1'b1)
2. Latch: clear(asynch_reset), preset(async_preset), next_state(1'b0),           clocked_on(1'b0), data_in(async_data Din), enable(clk),       synch_clear(1'b0),                synch_preset(1'b0),                synch_toggle(1'b0),                synch_enable(1'b0)

Ex: Flop in RTL:
always @(posedge clkosc or negedge nreset)
      if (~nreset) Out1 <= 'b0;
      else         Out1 <= Din1;

Flop replaced with SEQGEN in DC netlist: clear is tied to net 0, which is N35. preset=0, since no async preset. data_in=0 since it's not a latch. sync_clear/sync_preset/sync_toggle also 0. synch_enable=1 means it's a flop, so enable if used, is sync with clock. enable=0 as no enable in this logic.
 \**SEQGEN**  Out1_reg ( .clear(N35), .preset(1'b0), .next_state(Din1), .clocked_on(clkosc), .data_in(1'b0), .enable(1'b0), .Q(Out1), .synch_clear(1'b0), .synch_preset(1'b0), .synch_toggle(1'b0), .synch_enable(1'b1) );

Ex: Latch in RTL
always @(*)
  if (~nreset)  Out1   <= 'b0;
  else  if(clk) Out1   <= Din1;     
Latch replaced with SEQGEN in DC netlist: all sync_* signals set to 0 since it's a latch. synch_enable=0 as enable is not sync with clk in a latch. enable=clk since it's a latch.
  \**SEQGEN**  Out1_reg ( .clear(N139), .preset(1'b0), .next_state(1'b0), .clocked_on(1'b0), .data_in(Din1), .enable(clk), .Q(Out1), .synch_clear(1'b0), .synch_preset(1'b0), .synch_toggle(1'b0), .synch_enable(1'b0) );

NOTE: flop has both enable and clk ports separate. sync_enable is set to 1 for flop (and 0 for latch). That means, lib cells can have Enable and clk integrated into the flop. If we have RTL as shown below, it will generate a warning if there is no flop with integrated enable in the lib.
ex: always @(posedge clk) if (en) Y <= A; //This is a flop with enable signal.
warning by DC: The register 'Y_reg' may not be optimally implemented because of a lack of compatible components with correct clock/enable phase. (OPT-1205). => this will be implemented with Mux and flop as there's no "integrated enable flop" in library.
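The mux-plus-flop structure the tool falls back to is equivalent to this (a sketch, not actual DC output):

```verilog
// recirculating-mux equivalent of "always @(posedge clk) if (en) Y <= A;"
// built when the library has no integrated-enable flop
module en_flop (input clk, input en, input A, output reg Y);
  wire d_int;
  assign d_int = en ? A : Y;  // hold the old value when en=0
  always @(posedge clk)
    Y <= d_int;
endmodule
```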

#Set the following variable in HDL Compiler to generate additional information on inferred registers:
set hdlin_report_inferred_modules verbose

Example 8-1   Inference Report for D FF with sync preset control (for a latch, type changes to latch)
========================================================================
| Register Name | Type      | Width | Bus | MB | AR | AS | SR | SS | ST |
========================================================================
| Q_reg         | Flip-flop |   1   |  N  | N  | N  | N  | N  | Y  | N  |
========================================================================
Sequential Cell (Q_reg)
Cell Type: Flip-Flop
Width: 1
Bus: N (since just 1 bit)
Multibit Attribute: N (if it is multi bit ff, i.e each Q_reg[x] is a multi bit reg. in that case, this ff would get mapped to cell in .lib which has ff_bank group)
Clock: CLK (shows name of clk. For -ve edge flop, CLK' is shown as clock)
Async Clear(AR): 0
Async Set(AS): 0
Async Load: 0
Sync Clear(SR): 0
Sync Set(SS): SET (shows name of Sync Set signal)
Sync Toggle(ST): 0
Sync Load: 1

#Flops can have sync reset (there's no concept of sync reset for latches). Design Compiler does not infer synchronous resets for flops by default. It will see the sync reset signal as combo logic, and build combo logic (an AND gate at the i/p of the flop) to implement it. To indicate to the tool that we should use an existing flop (with sync reset), use the sync_set_reset Synopsys compiler directive in the Verilog/VHDL source files. HDL Compiler then connects these signals to the synch_clear and synch_preset pins on the SEQGEN in order to communicate to the mapper that these are the synchronous control signals and they should be kept as close to the register as possible. If the library has a reg with sync set/reset, then these are mapped; else the tool adds extra logic on the D i/p pin (an AND gate) to mimic this behaviour.
ex:  //synopsys sync_set_reset "SET" => this put in RTL inside the module for DFF. This says that pin SET is sync set pin, and SEQGEN cell with clr/set should be used.

#Latches and Flops can have async reset. DC is able to infer async reset for a flop (by choosing a SEQGEN cell with async clear and preset connected appropriately), but for latches, it's not able to do it (it chooses a SEQGEN cell with async clear/preset tied to 0). This is because it sees the clear/preset signal as any other combo signal, and builds combo logic to support it. DC maps the SEQGEN cell (with clr/preset tied to 0) to a normal latch (with no clr/set) in the library, and then adds extra logic to implement the async set/reset. It actually adds an AND gate on D with the other pin connected to clr/set, and an inverter on the clr/set pin followed by an OR gate (with the other pin of the OR gate tied to clk). So, basically we lose the advantage of having an async latch in the .lib. To indicate to the tool that we should use an existing latch (with async reset), use the async_set_reset Synopsys compiler directive in the Verilog/VHDL source files.
ex: //synopsys async_set_reset "SET" => this says pin SET is async set/reset pin, and SEQGEN cell with clr/set should be used.
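Putting the directive in context, a sketch of a latch with async reset (signal names as in the earlier latch example):

```verilog
module lat (input nreset, input clk, input Din1, output reg Out1);
  //synopsys async_set_reset "nreset"
  // with the directive, DC can map this to a library latch with async clear,
  // instead of building the clear out of combo gates around a plain latch
  always @(*)
    if (~nreset)  Out1 <= 1'b0;
    else if (clk) Out1 <= Din1;
endmodule
```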


#stats for case stmt: shows full/parallel for case stmt. auto means it's full/parallel.
A. full case: all possible branches of the case stmt are specified; otherwise a latch is synthesized. Non-full cases happen for state machines when the number of states is not a power of 2 (2^n). In such cases, unused states are opt as don't care.
B. parallel case: only one branch of the case stmt is active at a time (i.e. case items do not overlap). Overlap may happen when case stmts have "x" in the selection, or when multiple select signals are active at the same time (case (1'b1) sel_a:out=1; sel_b: out=0;). If more than 1 branch is active, then priority logic is built (sel_a given priority over sel_b), else a simple mux is synthesized. RTL sim may differ from gate sim for a non-parallel case.
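A non-full case that infers a latch, vs a default branch that keeps it pure combo (sketch fragments, assuming sel[1:0], out and A/B/C are declared):

```verilog
// Non-full: sel==2'b11 unspecified, so out must hold its value => latch inferred
always @(*)
  case (sel)
    2'b00: out = A;
    2'b01: out = B;
    2'b10: out = C;
  endcase

// Full: default covers the missing branch => no latch, pure combo logic
always @(*)
  case (sel)
    2'b00: out = A;
    2'b01: out = B;
    2'b10: out = C;
    default: out = 1'b0;
  endcase
```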


#The report_design command lists the current default register type specifications (if we used  "set_register_type" directive to set flipflop/latch to something from library) .
dc_shell> report_design
 ...
Flip-Flop Types:
    Default: FFX, FFXHP, FFXLP

#MUX_OPs: listed in report_design. MUX_OPs are multiplexers with built-in decoders. Faster than SELECT_OPs, as SELECT_OPs have decoding logic outside.
ex:
reg [7:0] flipper_ram[255:0]; => ram array of 256 words, 8 bits each
assign    p1_rd_data_out = flipper_ram[p1_addr_in]; => rd 8 bits out from addr[7:0] of ram. equiv to rd_data[7:0] = ram[addr[7:0]].
this gives the following statistics for MUX_OPs generated from previous stmt. (MUX_OPs are used to implement indexing into a data variable, using a variable address)

===========================================================
| block name/line  | Inputs | Outputs | # sel inputs | MB |
===========================================================
|  flipper_ram/32  |  256   |    8    |      8       | N  |
===========================================================
=> 8-bit o/p (rd_data), 8-bit select (addr[7:0]), 256 i/p. Here i/p refers to the distinct i/p terms that the mux is going to choose from, so there are 256 terms to choose from; the no. of bits for each term is already indicated in the o/p (8-bit o/p).

#list_designs: list the names of the designs loaded in memory, all modules are listed here.
#list_designs -show_file : shows the path of all the designs (*.db in main dir)


------------------------

#terminology within Synopsys.  https://solvnet.synopsys.com/dow_retrieve/F-2011.06/dcug/dcug_5.html

#designs => ckt desc using verilog HDL or VHDL. Can be at logic level or gate level. can be flat designs or hier designs. It consists of instances(or cells), nets (connects ports to pins and pins to pins), ports(i/o of design) and pins (i/o of cells within a design). It can contain subdesigns and library cells. A reference is a library component or design that can be used as an element in building a larger circuit. A design can contain multiple occurrences of a reference; each occurrence is an instance. The active design (the design being worked on) is called the current design. Most commands are specific to the current design.

#to list the names of the designs loaded in memory
dc_shell> list_designs
a2d_ctrl                digtop (*)              spi   etc => * shows that digtop is the current design

dc_shell> list_designs -show_file => shows memory file name corresponding to each design name
/db/Hawkeye/design1p0/HDL/Synthesis/digtop/digtop.db
digtop (*)
/db/Hawkeye/design1p0/HDL/Synthesis/digtop/clk_rst_gen.db
clk_rst_gen

#The create_design command creates a new design.
dc_shell> create_design my_design => creates new design but contains no design objects. Use the appropriate create commands (such as create_clock, create_cell, or create_port) to add design objects to the new design.

History of Simulators

Verilog-XL (from Gateway Design) was the 1st and only Verilog simulator available for signoff in the early 1990s. Cadence bought it, but ended support at Verilog-1995 and developed its own compiled-code simulator (NCVerilog). Docs from Cadence still refer to Verilog-XL when talking about NC-Verilog. The modern version of the NCSim family is IES, recommended for newer projects. However, as of 2018, IES is replaced by an even newer simulator, Xcelium. VCS (Verilog Compiled code Simulator, 1st SystemVerilog simulator) from Synopsys and ModelSim (ModelTech simulator, 1st VHDL simulator) from Mentor Graphics are the other two qualified for ASIC signoff. All 3 support V2001, VHDL-2002 and SV2005. ModelSim is implemented as an interpreter, so it's much slower compared to VCS and NC-Verilog, which are compiler based.

Cadence Simulator: Incisive Enterprise Simulator (IES) 9.2 and Verilog-XL (ncverilog) 9.2 from Cadence are the latest simulators (as of 2019). Now, as of 2021, Xcelium from Cadence is widely used.

Cadence IES simulator:

Cadence Incisive sim (IES) is based on Cadence's interleaved native compiled code arch (INCA), an extension of the native compiled code arch (NCA). With INCA, we can verify multiple languages (Verilog, VHDL, SV, Specman, SystemC, Verilog-AMS, VHDL-AMS, C, C++, SPICE files, etc), multiple levels (behavioral, RTL, gates), multiple paradigms (event driven, cycle based) and mixed signals (digital, analog), with the accuracy of event-driven simulation (found in interpreted and compiled-code technologies).
In an NCC simulator, a parser produces an intermediate representation of the input source text. This intermediate representation is then processed by a code generator that produces relocatable machine code that runs directly on the host processor. For example, in a Verilog/VHDL configuration, both the Verilog and VHDL compilers are used to generate code for the Verilog and VHDL portions of the design, respectively. During an elaboration process similar to the linking used in computer programming, the Verilog and VHDL code segments are combined into a single code stream. This single executable is then directly executed by the host processor.
For RTL designs, a min of 64MB of memory is required, while for gate simulation of 150K gates, a min of 128MB is reqd.

The simulator supports the IEEE 1364-2001 std for Verilog, OVI 2.0, and Verilog-XL. SystemVerilog extensions to Verilog as defined in the IEEE P1800 std are also implemented. We use the compiler (ncvlog) and then the elaborator (ncelab), which are integrated into IES. When we compile and elaborate a design, all internal reps of cells and views reqd by the simulator are contained in a single file stored in the lib dir. The compiler will automatically create a default work library called worklib in a directory called INCA_libs, which is under the current directory. All design units are compiled into this library.

Cadence Xcelium simulator:

Early simulators processed Verilog code in a single thread, managing a single active queue of events. This serial method resulted in significant run time. The Xcelium simulator is basically the same as IES, except that it can be run in single-core or multi-core configuration. Multi-core configuration can shorten runtime considerably, by breaking the RTL/gate design into indep parts, and simulating these parts using independent threads on parallel processors. Xcelium partitions the design into accelerated (ACC) and non-accelerated (NACC) regions. The ACC region contains the RTL/gate design, which can be run as parallel threads, while the NACC region contains behavioural portions such as the testbench, behavioural (model) memories, etc, which are run by the single-core engine. This multi-core engine compiler is invoked by passing the option "-mcebuild". The compiler will automatically create a default work library called worklib in a directory called xcelium.d, which is under the current directory. All design units are compiled into this library, as well as other libs explained later.

Example of simple design, testbench and testcase:

//simple verilog code that will compile and run: tb.v. To run it, use cmd: irun tb.v
module tb();
 int a;
 initial begin
   $display("a=%d",a);
   //$finish; => this not needed as there's only this file with initial, so nothing is running forever
 end
endmodule

//to run a simple module, create a tb, and change signals at module i/p pins using initial block.
// To run it, use cmd: irun tb.v Top_module.v +access+r -timescale 1ns/1ps => access option needed so that waveforms can be dumped.
module tb(); => brackets optional
 int a;
 reg b,c; //reg needed as wire can't be assigned in always blocks
 
 Top_module I_top (.IN1(b), .IN2(c)); //top module connections => preferred way
 //assign Top_module.IN1 = b; assign Top_module.IN2 = c; => instead of instantiating Top_module as in above line, we can also directly connect pins to nets. NOTE: since IN1,IN2 are nets, "always *" won't work, since it needs regs. so, we use assign.
 initial begin //to apply i/p stimuli and to end sim. Usually this whole block is placed in tc_1.v file, so that we can apply diff stimuli for each testcase
   #100 b=1'b1; #200 c=1'b0;
   $display("b=%d, c=%d",b,c);
   $finish; => this should be last stmt as after this stmt, tool exits
 end

//dump waveform in vcd format. To dump fsdb (novas proprietary format, but used by almost all vendors), we need other system task defined later.
 initial begin //to dump vcd files for all modules. Does not matter in which module it's placed, it still dumps for all modules.
   $dumpfile("tmp.vcd"); //$dumpfile should be called before $dumpvars
   $dumpvars;
   $dumpoff;
   #3150us; //dump vcd starting from 3150us
   $dumpon;
   #600us; //end dump at 3750us
   $dumpoff;
end

initial begin //other way to dump
   #1000; //start of dump
   $dumpfile("/sim/ACE/.../tmp.vcd"); //$dumpfile should be called before $dumpvars
   $dumpvars;
   #2000;
   $dumpflush; //end of dump
end

endmodule


Running simulator: 2 ways.

  1. Multi-step: First compile (different compilers for diff src files), then elaborate, then run the simulator. Here all these steps are run separately. Not recommended.
    1. Compiler: We have different compilers for VHDL and Verilog. ncvhdl is VHDL compiler, while ncvlog is Verilog compiler.
      • ncvhdl cmd: ncvhdl vhdl_src_files => ncvhdl is VHDL compiler. run ncvhdl -help to get other options
        • ex: ncvhdl -V200X -messages -smartorder a.vhd b.vhd => enables V1993 and V2001 features (use -V93 to enable only VHDL 1993 features), print informative msg, and compile in order independent mode
      • ncvlog cmd: ncvlog verilog_src_files => analyzes and compiles verilog src. performs syntax check on HDL design and generates intermediate representation, in lib database file called inca.architecture.lib_version.pak (architecture=lnx86)
    2. Elaborator: Elaborates the design. ncelab is the elaborator provided by Cadence that elaborates the design compiled by compiler above.
      • ncelab cmd:  ncelab top_level_design_unit => the elaborator takes the lib cell:view of the top level as i/p, constructs the design hier, establishes connectivity, and computes the initial values for all of the objects in the design. It generates m/c code and a snapshot whose access level provides no rd, wrt or connectivity access to simulation objects. That means we won't be able to probe these objects outside of HDL, which is OK in regression mode, but we need to set rd access in debug mode.
    3. Simulator: Simulates the design using the test case or patterns provided.
      • ncsim cmd: ncsim snapshot_name => The simulator loads the snapshot generated by the elaborator, as well as other objects that the compiler and elaborator generate that are referenced by the snapshot. The simulator may also load HDL source files, script files, and other data files as needed.
        • ex: ncsim -run worklib.top:module => NOTE: Using -gui option with ncsim starts simVision. That brings up Design browser and Console. Then we can run ncsim cmds on the Console.
  2. Single step: Here all the steps from above are run as part of one cmd. This is much more convenient. There are 3 different variants here depending on the simulator that you have from Cadence: either we use ncverilog, or we use irun/xrun (irun for IES and xrun for Xcelium). ncverilog is run in single step by using ncverilog on the cmd line. irun/xrun is very similar to ncverilog, but in addition to verilog/system verilog, it can also accept vhdl, systemC, AMS, etc. Since irun/xrun run all steps, they have a lot more options, each of which is specific to the tool being invoked. So, we should refer to those tools (i.e ncelab, xmsim, etc) for the specific options that are supported. irun/xrun options are not case sensitive (i.e -nolog same as -NoLoG). Also, short versions of cmd line options are allowed (i.e -nowarn same as -now; options vary in the min num of chars required for the option to be recognized in its short form).
    1. ncverilog: ncverilog does what multi step simulation does by invoking ncvlog, ncelab and ncsim for you. It lets us run the NC-Verilog simulator exactly the same way that we ran Verilog-XL (verilog-XL was run using cmd "verilog" on the cmd line). All cmd line args are the same as those of verilog-XL. On top of this, ncverilog also allows us to include ncvlog, ncelab and ncsim options on the cmd line in the form of + options. It also supports many more + options than verilog-XL.
    2. irun: It's for use with IES simulator. specifies all files on single cmd line. In ex below, top.v and sub.v are compiled by ncvlog using option -ieee1364, middle.vhdl is compiled by ncvhdl using option -v93, verify.e is recognized as specman e file and compiled using sn_compile.sh. After compiling all these, ncelab elaborates design using -access +r option (to provide rd access to simulation object, else in vcd/fsdb dump file, we won't see all wires,reg,etc) and generates sim snapshot. ncsim is then invoked with both SimVision (comprehensive debug env which includes design browser, waveform viewer, src code browser, signal flow browser,etc) and Specview gui.
      • ex: irun -ieee1364 -v93 +access+r +neg_tchk -gui verify.e top.v middle.vhd sub.v
      • ex: irun a.v b.v top.v tb.v => simplest cmd to run all rtl and tb files
    3. xrun: very similar to irun. It's for use with Xcelium simulator. However, compilers here are xmvlog, xmvhdl, sn_compile.sh. xmelab elaborates design, while xmsim simulates the design (xm means xcelium, while nc meant ncverilog which was used earlier in IES). xrun uses xmsc_run compiler i/f to compile c/c++ files. These compiled files, along with any other object files provided on cmd line, are then linked into single dynamic library, that is then automatically loaded before elaboration of design.
      • ex: xrun -ieee1364 -v93 +access+r +neg_tchk -gui verify.e top.v middle.vhd sub.v => NOTE: how all args are same as those of irun

 

Sequence of steps when running the Simulator:


NOTE: Both irun and ncverilog finally run ncsim which runs simulation cmds. Using -gui option brings up SimVision on ncsim cmd prompt. When running irun/ncverilog, this is what appears on screen:


1. ncvlog/ncvhdl: analyzes and compiles each source file. => done only when any file changes, else it's skipped
ex:     file: ../models/CFILTER.v
        module worklib.CFILTER:v
                errors: 0, warnings: 0


2. ncelab: elaborates files and constructs the design hier from top level design units. It auto figures out top level design units based on whether they are referenced elsewhere. Usually digtop_tb and testcase_name_tc are top level design units, as they aren't referenced anywhere else. Then it generates native compiled code for each module and provides a design hier summary. It finally writes the simulation snapshot, which is a file that has all info for the sim to run on it (w/o needing any info from anywhere else). The elaboration step is run only when a file changes, else it's skipped.
ex:   Elaborating the design hierarchy:
        Top level design units:
                digtop_tb
                S1_main_hunt_tc
        Building instance overlay tables: .................... Done
        Generating native compiled code:
                S1.AFE_AGC_S1:v <0x17bc2126>
                        streams:  28, words: 11022 < and so on for each module ....>
        Building instance specific data structures.   
        Loading native compiled code:     .................... Done
        Design hierarchy summary:   
                             Instances  Unique
                Modules:         1       1     
                Registers:       3       3  
                Initial blocks:  1       1
        Writing initial simulation snapshot: worklib.tb:sv   
Loading snapshot worklib.tb:sv .................... Done        
     
3. ncsim: loads the snapshot generated above and runs ncsim. The ncsim prompt appears. It first sources the ncsimrc file (the resource file needed by ncsim). Then it issues the "run" cmd, and on encountering $finish in any module, or on reaching the end of all "initial" blocks with no "always" or other infinite loops running, it issues the "exit" cmd to exit ncsim.
ex: ncsim> source /apps/cds/incisiv/12.20.018p2/tools/inca/files/ncsimrc => this file aliases run as "." and exit as "quit", so that . will also work instead of run, and quit will also work instead of exit.
    ncsim> run .... (displays stmt which have $display ...)
    ncsim> exit

----------------

NOTE: In verilog-XL(ncverilog) and irun, many cmds in ncvlog, ncelab and ncsim which are preceded by "-" are replaced by "+" options.
ex: ncvlog -define arg1 => in ncverilog/irun, it's irun +define+arg1

Help:
>irun -helphelp
>irun -helpall

NOTE: to get help on any error that we see on running irun, we can type this:
Ex: error ncelab: *E,CUVRFA: blah ... shows up. To get more info type: nchelp ncelab CUVRFA
Ex: If error happened in ncvlog, type: nchelp ncvlog CUVRFA

 


 

RTL and Gate Simulation setup:

Dir: /db/Hawkeye/design1p0/HDL/Testbenches/digtop/kagrawal/
3 subdir:
--------
tb: testbench dir. It has top level tb file (digtop_tb.v). digtop_tb.v defines a top level module digtop_tb, includes file all_tasks.v & xfilter.v, does initial begin .. end, and then instantiates module digtop and calls this dut, and connects all tb_* signals to appropriate digtop pins.

tc: testcase dir. It has test cases for different tests. i.e for interrupt block, it has interrupt_tc.v. Remember, any signal that you specify in tc should be an i/o port of a module or block, as internal net names may get renamed in gate synthesis, so even though the testcase may run on RTL, it'll fail to run on gate netlist.

sims: This is the main dir to run gatesims or RTL sims.

RTL:
-----
Build RTL dir:

run_rtl_sims (verilog) => script to run verilog RTL sims
----------------------
#we need to be able to run debussy to debug, so we load the precompiled PLI shared library from Debussy (the PLI app from Debussy has already been compiled into a dynamic shared lib, as is the case here) to provide bootstrap dynamic linking. The user defined bootstrap fn can then be accessed using load* (loadpli1, loadvpi, etc) in irun (or NC simulator). This PLI defines functions such as $fsdbDumpvars and $fsdbDumpfile, which are needed for dumping fsdb files (note: functions for vcd dump don't require this PLI, since they are supported by default by all simulators).
#for linux OS
set DEBUSSY_PLI     = "+loadpli1=/apps/novas/debussy/5.2_v21/share/PLI/nc_xl/LINUX/xl_shared/libpli.so:deb_PLIPtr"
#for SOLARIS OS
#set DEBUSSY_PLI     = "+loadpli1=/apps/novas/debussy/5.2_v20/share/PLI/nc_xl/SOL2/xl_shared/libpli.so:deb_PLIPtr"

irun -9.20.039-ius \ => specifying the version of irun is optional; a default is chosen if nothing is specified. (running "irun -version" returns the version of irun being used)
$DEBUSSY_PLI \ => loads debussy PLI
-y /db/pdk/lbc8/rev1/diglib/pml30/r2.5.0/verilog/models \ => -y for dir. All gate verilog models included in case we've any stdcells instantiated in RTL (usually clk gaters and mux/logic on clk/reset are hard instantiated)
#+incdir+../../tb/ \ => incdir option is used when we have `include "file1" in some other verilog file2. Then we have to include the whole dir where file1 resides, else while compiling file2, we'll get an error about file1 not found. We don't need to compile file1, as `include causes file1 contents to be included in file2. Note that if we try to compile file1 by itself, it may not compile, as any verilog file to be compiled needs proper syntax (i.e the file should have "module", "endmodule", etc. Many times such include files just have some verilog stmts, which is fine as they are just included in the main file2, which already has module etc).
-y /db/Hawkeye/design1p0/HDL/Source/golden \ => instead of this, we could also use "-f rtl_files.f" which would have paths for each RTL file to be included
/db/Hawkeye/design1p0/HDL/Testbenches/digtop/kagrawal/tb/digtop_tb.v \
/db/Hawkeye/design1p0/HDL/Testbenches/digtop/kagrawal/tc/$argv[1]_tc.v \
-coverage ALL -covdut digtop -covoverwrite -covworkdir ./coverage/cov_$1 => puts coverage results in dir "./coverage/cov_$1/". Says the top level dut used for coverage should be the "digtop" instance (we can also limit coverage to a particular sub-module by using the hier path for that instance (not the defn of the module, but the instance of the module)). It generates binary coverage data files (UCD) and coverage model files (UCM). Coverage types can be code (block, expr, fsm, toggle) or functional (assertion, covergroup). "ALL" enables all code coverage types listed (B=>Block, E=>Expression, F=>FSM, T=>Toggle, U=>fUnctional, A=>All. ex: we can write "-coverage BEFT" to enable all code coverage).
#NOTE: instead of using coverage cmds, we can also pass a .ccf cfg file which can have all cmds in there. i.e -covfile config.ccf. sample coverage.ccf file
select_coverage -all -module * => selects all coverage
set_libcell_scoring => IMP: sometimes we get no coverage results. The reason is that coverage stops at libcells, and sometimes all modules are treated as libcells when irun is given the source dir with the -y option (-y is usually used for libcell dirs). So, this "set_libcell_scoring" option forces coverage to be reported for all libcells too.

-l ./rtl_logs/$argv[1].log \ => -l (small letter L) is to specify logfile instead of default irun.log. We can also use /$1.log (as $1 and $argv[1] are same)
+nclibdirname+"$argv[1]_INCA_libs" \
+access+r \ => rd access so that all wires,reg etc can be accessed in vcd/fsdb files
+libext+.v \ => specifies the extension of files referenced by the -y option (+libext+extension). If this option is not used, then files referenced by -y should not have a file extension, else they will be ignored (very imp to use this with -y)
+licq \
#+sv \ => with -sv option, all verilog type files are compiled as SystemVerilog.
+notimingchecks \ => do not execute timing checks for $setup, $recrem, etc
-input dump.tcl \ => optional. needed for shm db dump. see in simvision section below for more details
+define+TI_functiononly \
+define+FSDBFILE=\\\"/sim/HAWKEYE_DS/kagrawal/digtop/rtl/$argv[1].fsdb\\\" \ => important to have \\\ before "
+define+FSDB \
+define+IMGFILE=\\\"/sim/.../a.img\\\" \ => this can be used in tb.v file or any other verilog file, to assign value from cmdline. i.e
 `ifdef IMGFILE defparam tb.block1.PRELOADFILE=`IMGFILE; `endif
-svseed random \ => assigns random seed to all $urandom fn
+nctimescale+1ns/1ps => default timescale to use if no timescale defined anywhere
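The +define+FSDB and +define+FSDBFILE options above are typically consumed in the testbench roughly like this. This is a sketch only: the fsdb dump tasks come from the Debussy PLI loaded via $DEBUSSY_PLI, and the hierarchy name digtop_tb is an assumption.

```verilog
//in digtop_tb.v (sketch): dump fsdb only when +define+FSDB is on the cmd line
`ifdef FSDB
  initial begin
    $fsdbDumpfile(`FSDBFILE);    //file name comes from +define+FSDBFILE=\\\"...\\\"
    $fsdbDumpvars(0, digtop_tb); //depth 0 => dump all levels below digtop_tb
  end
`endif
```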

#-work: by default, irun compiles all design units in HDL files in work library called worklib (located within INCA_libs dir). We can change work lib name by using -work.
#dir structure is:
INCA_libs/irun.nc/xllibs/models,golden => for models dir, golden dir, etc specified with -y above stored in xllibs
INCA_libs/worklib/.inca*db, inca*pak   => contains all compiled units as one file in .pak lib database. within worklib dir, we have subdir for std,ieee,worklib,synopsys,etc which have their own .pak database.

#-linedebug: to get debugging info

run_rtl_sims (mixed: tb is in verilog but src files are in vhdl/verilog)
----------------------
remains same as above (i.e same as running verilog rtl sims)
The only difference is that novas fsdb dump doesn't work on vhdl src files (i.e it only shows signals for verilog files in waveform, but not for vhdl files). Option is to dump vcd file, as vcd file will always have all signals. Other option is to set DEBUSSY_PLI to newer version of novas (in run_rtl_sims file) as follows: (doesn't seem to work ?)
DEBUSSY_PLI     = "+loadpli1=/apps/novas/debussy/2010.04/share/PLI/IUS/LINUX/boot/debpli:novas_pli_boot"

run vhdl rtl sims: /db/MOTGEMINI_DS/design1p0/HDL/Testbenches/digtop/sims/run_rtl_sims
-----------------
#for fsdb dump
In tb/tb_spi.vhd file, put "use WORK.novas.all;" at the top before entity declaration, and also add this directive:
process
begin
`ifdef FSDB
        fsdbDumpvars(0,":");
        fsdbDumpfile("test.fsdb");
`endif
        wait; -- needed, else the process body repeats forever
end process;

#above code, always dumps fsdb file as dump.fsdb in current dir. So, we can instead run this to dump into specific file:
#create file nc.do and then call this file from irun cmd line by adding this option: -input nc.do \
call fsdbDumpfile /sim/HAWKEYE_DS/kagrawal/digtop/rtl/SPI.fsdb
call fsdbDumpvars 0 :
run => if we don't add this line, then ncsim stops at cmd prompt, and we have to type run on the prompt to continue

-----------
#run_rtl_sims (vhdl):
#LD_LIBRARY_PATH needs to be set
#solaris
#setenv LD_LIBRARY_PATH /apps/novas/debussy/5.2_v21/share/PLI/nc_vhdl/SOL2:$LD_LIBRARY_PATH
#linux
setenv LD_LIBRARY_PATH /apps/novas/debussy/5.2_v21/share/PLI/nc_vhdl/LINUX:$LD_LIBRARY_PATH

#note, here we specified debussy_pli with path separately defined above, while for verilog, it was all in one line.
set DEBUSSY_PLI     = "-loadpli1 debpli:novas_pli_boot"
#we may also add -loadcfc option above, to get rid of some system errors:
#set DEBUSSY_PLI     = "-loadpli1 debpli:novas_pli_boot -loadcfc debcfc:novas_cfc_boot"

#irun (same as for verilog, except -top,relax,V93 options used)
irun \
$DEBUSSY_PLI \
-y /db/pdk/lbc7/rev1/diglib/msl270/r3.0.0/verilog/models \
/apps/novas/debussy/5.2_v21/share/PLI/nc_vhdl/LINUX/novas.vhd \
#/apps/novas/debussy/2011.01/share/PLI/IUS/LINUX/boot/novas.vhd \
/db/MOTGEMINI_DS/design1p0/HDL/Source/spi_typedefs.vhd \
/db/MOTGEMINI_DS/design1p0/HDL/Source/spi_control.vhd \
/db/MOTGEMINI_DS/design1p0/HDL/Source/spi_regs.vhd \
/db/MOTGEMINI_DS/design1p0/HDL/Source/spi.vhd \
/db/MOTGEMINI_DS/design1p0/HDL/Testbenches/digtop/tb/tb_spi.vhd \
-top E \ => for vhdl, top entity has to be declared (this top entity is in tb/tb_spi.vhd)
-relax \ => to relax strict vhdl requirements
-V93 \ => since our vhdl is 1993 format
-input nc.do \ => use this, if we call fsdb cmds in nc.do instead of fsdb cmds in tb_spi.vhd
-l ./rtl_logs/$argv[1].log  ... => other options same as those for verilog

-------
#for vhdl and SystemC files, you have to specify the top level with -top option, as simulator does not automatically calculate top-level VHDL/SystemC design units. However, with this option, autodetection of top level verilog modules is disabled. (-vhdltop and -sctop specifies VHDL top level and Sc top level, but doesn't disable auto calculation of verilog top level units)
-top [lib].cell[:view] => specifies top level unit, can use multiple -top options to specify multiple top-level units
Ex: -top E \ => entity E is defined in top level testbench file tb_spi.vhd, which calls top level source entity spi.

#for vhdl, the IEEE 1076 standard does not allow multiple choices (i.e. 0=>'1', OTHERS=>'0') in an array aggregate that is not locally static (i.e. VECTOR(size-1 downto 0) has a variable range). If you make the range of the array static (e.g. VECTOR(3 downto 0)) or provide only one choice (e.g. OTHERS=>'0'), then the code will compile correctly. Cadence has adjusted ncvhdl with a switch named '-relax' which relaxes a variety of LRM rules, and allows such code to compile.
-relax \
#we can also use option -V93 to force irun to compile with VHDL93 syntax.

GATE:
----
gate sims run on a gate level netlist, which has all nets as "wire". If there's a net which is an i/o port of a module, it has to be connected through a "wire" at a higher level to another i/o port of some module, or to an i/o port of the top level module. All these "wire" nets have parasitics associated with them in the spef file, and hence delays associated with them in the sdf file. Some nets appear as "wire", but during optimization, they are not used for connections (e.g. instead of the Q pin of a flop, the QZ pin is sometimes used, which leaves the net associated with the Q pin floating). Such nets, even though listed as "wire", don't have any parasitics and are reported as "unannotated nets" during sdf file generation (in PT).

We do timing checks when running gate sims. This may cause non-convergence in simulator for cases where there are -ve setup/hold times or -ve rec/rem values in sdf file. see in verilog.txt.

--------------------------------
GateSim (for verilog testbench):
--------------------------------
For gatesims, we do xfiltering for meta flops, and we do sdf annotation for all nets/cells. We add this in digtop_tb.v in b/w "module ... endmodule", whenever SDF_MAX or SDF_MIN is defined.
digtop_tb.v:
1A. xfilter: include "../tb/xfilter.v" => In this file, we define the Xon parameter for all meta flops to be 0. On doing this, the setup/hold check is turned off for the flop, so that we don't see these warnings: "Warning!  Timing violation $setuphold<setup> ( posedge CLKIN:65071 PS, posedge EN:65077 PS,  0.248 : 248 PS,  0.041 : 41 PS );... Time: 6548 PS" for that flop. Here, the numbers shown are a setup of 248ps (min:typ:max) and a hold of 41ps (min:typ:max). When only 2 values are shown instead of a triplet, that means the sdf file had only 2 values. Here CLK and EN come within 6ps (65077-65071), causing a viol.
ex: defparam testbench.Idigtop.Ideglitch.mota_itrip_deg.sig_meta_reg.Xon = 0; => This Xon parameter = 1 in the model of the flop (in the ifdef TI_verilog section of the DTCD2.v flop). So, by default X is propagated, but if we set Xon=0, then X is not propagated. The X value in that meta flop is forced to whatever RTL is modeling. That means whatever is at the i/p of the flop right before the clk edge is passed. If the i/p changes right on the clk edge, then the coding sequence determines which happens first, i/p change or clk edge. If we don't set Xon=0, then X's will eventually get propagated to all logic, and all our test cases will fail. By setting Xon=0, we force the o/p of the flop to be 0 or 1 always.
 => Next, in the filtered_logs dir, we copy all log files from the gate_logs dir, and search for any "Warning" msg using the filter_warnings.pl script. We should not see any warnings, as meta flops are the only ones that should have setup/hold viol. Any other viol is real, and should be fixed in the design. Since we were timing clean, we should investigate whether we had mistakenly set that path as a false path in PT/ETS.

1B. instead of xfilter.v file, we can also turn off timing check by using tcheck cmd by specifying it on irun cmd. (valid for irun versions 14.2 or later)
    +nctfile+gate.tfile => arg to irun (no space in b/w "+")
   ex: In gate.tfile, we put 1st sync flop for all synchronizers to be filtered out for x propagation. This also prevents tool from generating "Warning! Timing violation $setuphold ...". option 1A above may still generate warnings depending on library model written.
       PATH tb_digtop.dut.sync_*.genblk1_S_sync1 -tcheck => turns off timing check for flop genblk1_S_sync1. Not sure, if it turns off all timing checks or just setup/hold.
   NOTE: if running an older version of irun, then the tool doesn't pick up these tchecks and will throw this warning "ncelab: *W,TFANOTU (gate.tfile) tfile node ... was not used by design". This means the tool discarded the tcheck, due to the old version, etc.

1C. we can also provide timing check file via "-input tcheck_off.tcl", which will have " tcheck -off" cmd for 1st stage of all sync flops.
ex: tcheck -off veridian_tb...i_sync_flops.u_sync.tiboxv_sync_2s_acn_sync_0

2. sdf_annotation: $sdf_annotate( .... ) for both max/min. see in sdf annotator section below.
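As a sketch of what these $sdf_annotate calls look like in digtop_tb.v (the sdf file paths are assumptions; the instance name dut matches the tb description above, and sdf_max.log is the log file checked after gatesims):

```verilog
//in digtop_tb.v, inside module ... endmodule (sketch; paths are placeholders)
//args: ("sdf_file", instance, "config_file", "log_file", "mtm_spec")
`ifdef SDF_MAX
  initial $sdf_annotate("../sdf/digtop_max.pt.sdf", digtop_tb.dut, , "sdf_max.log", "MAXIMUM");
`endif
`ifdef SDF_MIN
  initial $sdf_annotate("../sdf/digtop_min.pt.sdf", digtop_tb.dut, , "sdf_min.log", "MINIMUM");
`endif
```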

Dir: /db/Hawkeye/design1p0/HDL/Testbenches/digtop/kagrawal/gatesims
run_gate_sims_max => script to run gatesims for max delay
----------------------
#same as run_rtl_sims except netlist is gate level, neg_tchk, max_delays, define+TI_verilog used

set DEBUSSY_PLI     = "+loadpli1=/apps/novas/debussy/5.2_v21/share/PLI/nc_xl/LINUX/xl_shared/libpli.so:deb_PLIPtr"
irun \
$DEBUSSY_PLI \
-y /db/pdk/lbc8/rev1/diglib/pml30/r2.5.0/verilog/models \
../../../Source/global.v \
../../../FinalFiles/digtop/digtop_final_route.v \ => gate netlist
../tb/digtop_tb.v \
../tc/$argv[1]_tc.v \
-l ./gate_logs/$argv[1]_max.log \
+nclibdirname+"$argv[1]_INCA_libs" \
+access+r \
+libext+.v \
+licq \
+sv \
+neg_tchk \  => allows neg values in $setuphold and $recovery timing checks in the Verilog description and in SETUPHOLD and RECREM timing checks in SDF annotation. This is needed because, by default, tools zero out -ve timing check numbers, as they may not converge and can cause large performance issues. see in verilog.txt for more info on -ve timing checks.
+max_delays \ => Apply the maximum delay value if a timing triplet in the form min:typ:max is provided in the Verilog description or in the SDF annotation.
-input dump_gate.tcl => optional. same format as for rtl sims.
-SDF_CMD_FILE sdf_max.cmd => optional. see sdf section below for details.
+nctfile+gate.tfile => optional. turns off timing checks for specified gates. see above section for details.
+define+TI_verilog \ => TI_verilog uses models with delays
+define+FSDB \
+define+FSDBFILE=\\\"/sim/NOZOMI_NEXT_OA/kagrawal/digtop/gate/$argv[1]_max.fsdb\\\" \
#+define+VCD \
+define+VCDFILE=\\\"/sim/NOZOMI_NEXT_OA/kagrawal/digtop/gate/$argv[1]_max.vcd\\\" \
+define+SDF_MAX \ => SDF_MAX annotation used in top level module
+nowarnCUVWSP \
+nctimescale+1ns/1ps
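To see why +neg_tchk matters, here is a sketch of a hypothetical flop model carrying a negative hold limit in its $setuphold check; without +neg_tchk, the simulator zeroes out the negative value (the cell name and numbers are made up for illustration):

```verilog
//hypothetical cell model (sketch): negative hold value in timing check
module DFF_X1 (input CK, D, output reg Q);
  reg notifier; //toggled by the simulator on a timing violation
  always @(posedge CK) Q <= D;
  specify
    //setup=0.25, hold=-0.04; the -ve value needs +neg_tchk to be honored
    $setuphold(posedge CK, D, 0.25, -0.04, notifier);
  endspecify
endmodule
```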

run_gate_sims_min => script to run gatesims for min delay.
----------------------
same as for max, except +min_delays, +define+SDF_MIN used.

NOTE: after running gatesims with sdf_annotate, look in sdf_max.log to make sure it has no errors or warnings. Else, sdf is not correctly annotated.

----------------------------
GateSim (for vhdl testbench):
----------------------------
Dir: /db/MOTGEMINI_DS/design1p0/HDL/Testbenches/digtop/kagrawal/gatesims
run_gate_sims_max => script to run gatesims for max delay

#sdf compiled file generation (see below in sdf annotation)
ncsdfc /db/MOTGEMINI_DS/design1p0/HDL/FinalFiles/digtop/digtop_max.pt.sdf -output ./digtop_max.pt.sdf.X

setenv LD_LIBRARY_PATH /apps/novas/debussy/5.2_v21/share/PLI/nc_vhdl/LINUX:$LD_LIBRARY_PATH
set DEBUSSY_PLI     = "-loadpli1 debpli:novas_pli_boot -loadcfc debcfc:novas_cfc_boot"

irun \
$DEBUSSY_PLI \
-y /db/pdk/lbc8/rev1/diglib/pml30/r2.5.0/verilog/models \
/apps/novas/debussy/5.2_v21/share/PLI/nc_vhdl/LINUX/novas.vhd \
/db/Hawkeye/design1p0/HDL/Source/golden/global.v \
/db/Hawkeye/design1p0/HDL/FinalFiles/digtop/digtop_final_route.v \ => gate netlist
/db/Hawkeye/design1p0/HDL/Testbenches/digtop/kagrawal/tb/digtop_tb.v \
/db/Hawkeye/design1p0/HDL/Testbenches/digtop/kagrawal/tc/$argv[1]_tc.v \
-l ./gate_logs/$argv[1]_max.log \
-input nc_max.do \ => look above in rtl sim for vhdl (it calls fsdb dump functions)
+nclibdirname+"$argv[1]_INCA_libs" \
+access+r \
+libext+.v \
+licq \
#+sv \
+neg_tchk \ =>allows neg values in $setuphold and $recovery timing checks in the Verilog description and in SETUPHOLD and RECREM timing checks in SDF annotation.
+max_delays \ => Apply the maximum delay value if a timing triplet in the form min:typ:max is provided in the Verilog
description or in the SDF annotation.
+define+TI_verilog \ => TI_verilog uses models with delays
+define+FSDB \
+define+FSDBFILE=\\\"/sim/HAWKEYE_DS/kagrawal/digtop/gate/$argv[1]_max.fsdb\\\" \
+define+GATE \
+define+SDF_MAX \ => SDF_MAX annotation used in top level module
+nowarnCUVWSP \
+nctimescale+1ns/1ps

run_gate_sims_min => script to run gatesims for min delay.
----------------------
same as for max, except +min_delays, +define+SDF_MIN used.

 




Waveform viewer and debugging system:

Many waveform viewers are available to view the results of simulation. Some popular ones are as below:

  1. SimVision from Cadence: comprehensive debug env which includes design browser, waveform viewer, src code browser, signal flow browser,etc.  It uses *.shm waveform database to store waveforms. Expensive license ($50K)
  2. Debussy from Novas (purchased by SpringSoft in 2008): The Knowledge-Based Debugging System. Debussy is cheaper ($5K), but its superset Verdi, a behaviour-based debugger, is mostly used. It uses fsdb and vcd waveform databases. All the cmds of Debussy are valid in Verdi. Debussy is invoked by typing: debussy -f <cmd_file>. We use Debussy, Release 2008.10, Linux x86_64/64bit (though it says it's using the verdi 2008.10 version with 64 bits).
  3. Verdi from Novas (Verdi was a product of Novas, but was purchased by Synopsys): Verdi is superset of Debussy, costs more but has lot more features. invoked by typing: verdi -f <cmd_file>. Verdi is the recommended tool to use (instead of debussy).

All these waveform viewers need waveform in some format to display it. Two most common waveforms supported are as below.

  1. VCD: (value change dump), ASCII  format for waveform dumpfiles. defined by IEEE std 1364-2001 and supports 6 value VCD format (orig 4 valued logic: 0,1,Z,X and later signal strength and direction added). widely used. The VCD file comprises a header section with date, simulator, and timescale information; a variable definition section; and a value change section, in that order.
  2. FSDB: (fast signal database), which is Novas' proprietary waveform dump format.  It is much more compressed than the standard VCD format generated by most simulators.  Novas provides a set of object files (using +loadpli) that link with all common commercial simulators to generate an FSDB file directly.
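For reference, here is a minimal VCD file sketch illustrating the three sections in order (header, variable definitions, value changes); the contents are illustrative only:

```
$date Mon Jan 01 00:00:00 2024 $end
$version irun (sketch) $end
$timescale 1ns $end
$scope module tb $end
$var reg 1 ! b $end
$upscope $end
$enddefinitions $end
#0
0!
#100
1!
```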



SimVision:
--------
Using -gui option with ncsim or irun/ncverilog brings up SimVision.
> irun -gui -f run.f -access RWC -linedebug (add "-uvmlinedebug" if running with uvm)
NOTE: "-access +r" or "-access RWC" is needed, else waveform dump won't show any signals (as they don't have read permission, r=read, w=write, c=connectivity to help with x propagation). Also, ncsim cmds for dumping waveform into cadence database (waves.shm) is needed in input script or on ncsim prompt. See below for details.

We can also directly type simvision to bring up simvision. We can then open "waves.shm" database.
simvision &
simvision -waves waves.shm -input digtop.svcf & => This will open up waves.shm database, with signal file digtop.svcf (similar to rc file in nWave). We can do "File->Source command script" to load svcf file or "save command script" to save svcf file.
 
SimVision has a Design Browser and a Console.
1. Design Browser (SimVision): It allows us to browse the design. It shows modules, RTL, etc. NOTE: If we select signals here, they won't show up in the waveform window automatically. We have to do "send to waveform" to see them in the waveform viewer.

2A. Waveform (SimVision): To invoke the waveform viewer, click on "send to waveform" on the design browser of SimVision (it's the 2nd button after the + sign on the top right side). Imp cmds:
send to: => used to send values from waveform to RTL or schematic and vice versa
= => this zooms to fit waveform

2B. On waveform, to see delta time delay: take mouse to "yellow pulse shape", hold right click for a second, and a pop up comes. choose "expand time"->All_time. Then on waveform we see blue shaded area. The blue area shows what happens in delta delay time (you will see that time remains same in blue area, but numbers in brackets change implying delta delay)

3. Console (SimVision): It's used to run ncsim cmds. It has the ncsim prompt on the Simulator tab (there are 2 tabs on the bottom: SimVision and Simulator). When we type the "run" cmd on it, that is when it starts running sims. When we are not in SimVision gui mode, then run is automatically placed on the ncsim cmd prompt, so that our simulation runs to completion. Then when completed, exit is automatically placed on the ncsim cmd prompt to exit sim. If we want to stop the sim when in cmd line mode, we can add "-tcl" to the cmd line, and then the tool will stop at the ncsim prompt. We'll have to type "run" on the ncsim prompt to continue. ex: irun -tcl -f run.f (stops at ncsim prompt)
Ex of ncsim cmd:
ncsim > database -event -open waves -into waves.shm => create shm database named waves.shm (which contains .dsn and .trn files, which are the waveform dump). waves is the scope. "-event" allows zero time events to be seen on any signal, which is otherwise not possible. This helps detect edges happening with 0 width.
ncsim > probe -create -all -depth all -tasks -functions -memories -database waves -name probe_a => probe all signals, at all depths, and for all tasks/functions too. "-all" does not probe memories (2-d, 3-d arrays), so we have to put -memories also. (Also, if we run in gui mode, w/o using -tcl, then memories are automatically added to the probe). Put this probe data into database waves. If no name is provided for the probe, then ncsim will name it probe 1, probe 2, etc. NOTE: in the design browser, select Scope as "waves", and then you will see all signals with values. By default, the scope is "all available data" which shows the simulator scope also (which may not have any probe data).

NOTE: To get an extended vcd (which shows port dirn too), do this: (evcd is needed to generate tdl files)
ncsim> database -open waves -evcd -into myvcd.vcde
ncsim> probe -create testbench.dut -evcd -database waves
Instead of the above 2 cmds, we can also do this in the Tb.sv file: initial $dumpports(UVMTb.I_dut, "sim.vcde");

ncsim > run => runs ncsim till it terminates. pgm terminates when $finish is reached in any module.
ncsim > run 2.5 ms => runs ncsim for 2.5ms
ncsim > exit => exits ncsim.
ncsim> reset => resets ncsim, so that we can run simulation again starting from time 0
NOTE: To rerun new rtl after modification, we can either close simvision and rerun the simulation again, or from the Console window we can click Simulation->Reinvoke Simulator. This reruns the new rtl and loads the new waveform.

NOTE: we can provide the -input option with irun, specifying an input file, which gets loaded on the ncsim prompt. This saves us from manually typing the ncsim cmds on the cmd line. If we don't provide cmds for "database -open .." or "probe -create ...", then no cadence database is created. To create a vcd/fsdb database, we have to provide the system task "$dumpvars .." within an "initial begin ... end" block to dump the waveform database.
Ex: irun -access +r -f rtl_files.f -input dump.tcl .... => -access +r is needed to see signals in waveform dump
dump.tcl has these lines:
database -open waves -into /sim/bellatrix/kagrawal/waves.shm -default
probe -create -emptyok -database waves -all -memories -depth 10 digtop_tb => var in function/task not dumped by default. To dump those, use -variables.
probe -create -emptyok -database waves -all           -depth 3  Silver_top.Xosc.I1 => This type of probe used for ams sims to dump voltages upto 3 levels deep
probe -create -emptyok -database waves             -flow -ports Silver_top.Xosc.AVDD => This probes current at AVDD port of Xosc block. valid for ams sims, since digital blocks (which are modeled as verilog) do not consume any current.
probe -create -emptyok -database waves -all -flow     -depth 3  Silver_top.Xosc.I1 => This probes current for all nets upto 3 levels deep.
probe -create -emptyok -database waves -all -memories -depth 10 -domain digital => This is helpful in ams sims, where we do not need to specify path of digital block. It does probing upto 10 level deep of all nodes which are digital in nature (i.e have verilog models)
run
quit => this is executed after run has finished

----------

Xcelium (xrun):

--------

As discussed earlier, xrun is used to run designs on the Xcelium simulator. It works similar to irun. All of the options for xrun are the same as those for irun. 2 imp help cmds for xrun:

> xrun -helpshowsubject => shows list of subjects as xmvlog, xmvhdl, xmelab, xmsim, etc

> xrun -helpsubject xmvlog => shows all options for subject xmvlog, as -assert, -ams, etc

> xrun -helpall -helpalias => -helpall displays list of every supported option, while -helpalias displays different ways to enter an option (ones entered using -/+ signs. irun/xrun use both "-" and "+" for cmd line options)

ex: xrun top.v test.c obj1.so -y ./libs -y ./models -l run1.log ... (source files can be in any format as .v, sv, .vhd, .e, .vams, .c, .cpp, .s, .o, .so, etc)

This is how the dir looks when you run xrun: ex:

xcelium.d => instead of INCA_libs, this build dir is created. Contents in this dir are automatically checked (timestamp, snapshot info, etc) on rerun of xrun, to determine if recompilation or re-elaboration is needed. It has the following subdirs:

1. xcelium.d/run.<platform>.<xrun_version>.d (ex: xcelium.d/test_sim.lnx8664.19.01.d; instead of run, we created test_sim as a custom name by using option -snapshot test_sim). A soft link named test_sim.d is created by default pointing to this dir. Within this dir is the xllibs subdir, which has a subdir for each -y library and -v library file (i.e run.d/xllibs/<libs> and run.d/xllibs/<models> when the cmd is "xrun top.v -y ./libs -y ./models ... ")

2. worklib => design units contained in the HDL source files (as in top.v) are compiled into this dir. Using option "-work <worklib_name>" changes the name of this worklib dir. Within this dir is a library database file called "xlm.lnx8664.066.pak", which stores all intermediate objects required by Xcelium core tools. These .pak files are large and so are usually compressed by using the -zlib option

3. history => There is a history file which records all prev cmds run

options:

-64/-64bit => runs 64bit version of xrun

-top chipTb => defines top level module (can have multiple such cmds since there are typically multiple top level modules from uvm, design, etc). This option not needed for v/sv top level modules, but required for vhdl/systemC top level modules. By default, top level design units are automatically determined for v/sv, but are not automatically inferred for vhdl/systemC if top units are in these files. In such cases, this option is required

-l <logfile> => by default, log is written to xrun.log in same dir where xrun was invoked

-v libfile.v => old scheme of lib mgmt. xrun scans this file for module/udp defn that can't be resolved in normal src files specified. -v option causes module/udp in these files to be parsed, only if they have the same name as unresolved module/udp. Otherwise they are not parsed, which saves time. If we omit -v, then these module/udp in these files will always be parsed

-y <lib_dir> => specifies path to library dir, where files containing defn of module/udp are to be found

-define foo=2 => -define similar to using `define compiler directive in verilog. same as irun, can use +define+ also. If there's no value to assign, we can also do "-define foo".
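As a sketch of how -define interacts with source code (module and macro names here are made up, not from the original notes), a macro set on the cmd line can be referenced or tested via `ifdef just like a `define in the source:

```verilog
// sketch only: assumes "xrun -define WIDTH=8 -define SIM_DEBUG ..." on the cmd line
module counter (input clk, input rst, output reg [`WIDTH-1:0] cnt);
  always @(posedge clk or posedge rst)
    if (rst) cnt <= 0;
    else     cnt <= cnt + 1;
`ifdef SIM_DEBUG  // this block is compiled only when -define SIM_DEBUG is passed
  always @(posedge clk) $display("cnt=%0d", cnt);
`endif
endmodule
```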

-compile => parse and compile source files, but do not elaborate

-elaborate => parse and compile source files, elaborate the design and generate a simulation snapshot, but do not simulate. If the -compile/-elaborate options are not used, then all steps run (compile/elaborate/simulate)

-hal => this runs HAL (HDL analysis) on the snapshot instead of running the simulator. This is used to check for any errors/warnings etc on design files.

-snapshot <snapshot_name> => generate sim snapshot with the given name (-name and -snapshot are the same) in xcelium.d/worklib/<snapshot_name>/*. By default, snapshot names are xcelium.d/worklib/run/*. This option also changes the name of xcelium.d/run.lnx8664.19.01.d to xcelium.d/<snapshot_name>.lnx8664.19.01.d.

-r <snapshot_name> => load and simulate the specified snapshot, w/o doing any kind of checking. By providing "-input file1.tcl", we can provide diff tcl cmd i/p files to have multiple diff sims with the same snapshot. -R (w/o any snapshot name) is used to simulate the last snapshot generated by the xrun cmd.

-xmlibdirname <xcelium_dirname> => to have a custom dir name instead of xcelium.d. When running the simulator only (using the -r or -R option), we need to provide this if the snapshot is not in the default dir path or has a non-default name.

-clean => this forces removal of dir xmlibdirname or xcelium.d and starts fresh. This causes xrun to recompile, re-elaborate and recreate the dir. In absence of this option, automatic checks are done to determine if this dir can be reused

-hdlvar /home/.../my_hdl.var => This var file is a configuration file that can have all cmd line options and args in 1 place (i.e DEFINE XRUNOPTS -ieee1364  -access +rw etc) . That way, the regular xrun cmd won't look lengthy and complex

-f <args_file> => We can also provide an additional arguments file that can have any args in it: names of source files, and everything else needed with xrun, which will be added to the existing xrun args (i.e -clean source.v ...)
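Tying several of these options together, a typical elaborate-once/simulate-many flow might look like this (file names, snapshot name and tcl scripts are illustrative, not from the original notes):

```
# compile + elaborate only, generating a named snapshot
xrun -64bit -elaborate -snapshot test_sim -access +r -f rtl_files.f -clean

# rerun the same snapshot with different tcl input files, no recompile/re-elaborate
xrun -64bit -r test_sim -input dump1.tcl -l run1.log
xrun -64bit -r test_sim -input dump2.tcl -l run2.log
```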

uvm cmd line options supported by xrun:

-uvm => enable support for uvm

-uvmhome /UVM/.../uvm-1.2 => specifies loc of uvm installation. By default, uvm is installed in <install_dir>/tools/methodology

-uvmexthome .../CDNS-1.2 => loc of cadence extensions to uvm. By default, uvm extensions are installed in <uvmhome>/additions/sv

+UVM_TESTNAME=<test_name> => specify name of test. The run_test() task in the top level module calls this test to run.
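A possible uvm invocation combining the options above (test name and file lists are made up for illustration):

```
xrun -64bit -uvm -access +rwc -f rtl_files.f -f tb_files.f +UVM_TESTNAME=base_test -input dump.tcl -l uvm_run.log
```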

 

 


Debussy:
---------
used to see the waveform dump, and annotate it to rtl/gate so that debug is easier. It is also used to see a schematic rep of rtl or gates, which helps to see connectivity. The gate schematic especially helps during ECO as we don't have to manually go thru the verilog text file of digtop_final_route.v.
Debussy has following tools as part of the suite.

nTrace:
-------
gui that comes up to traverse design hier. Can trace load, driver, connectivity. Can change src code by choosing your editor: tools->preferences->editor, and then choosing source->edit source file.
to import design, goto file->import design. Select "from file", set Virtual Top as "digtop", default dir as "/db/Hawkeye/.../FinalFiles/digtop", then in bottom LHS panel, goto dir "/db/Hawkeye/.../FinalFiles/digtop", then click on synthesized netlist "digtop_final_route.v" in RHS, and click Add. Then it shows up in design Files. Click OK. Now, you can see whole netlist in the top panel
active annotation: allows viewing verification results in the context of src code. But before using this, we need to load sim results (in an FSDB file) using file->load simulation results. Then in the hier browser, double click the instance that you want, choose source->goto->line, enter the line number and OK. Then choose source->active annotation (or the x key after putting the cursor in the source code pane) to activate active annotation. Values associated with each signal are then displayed at time 0. Now we can search forward/backward on signals to change time.

nSchema
----------
gui that shows schematic.
Once you have imported the design, goto tools->new schematic->current scope. Then the schematic is drawn for whatever is selected as current scope in the panel (the current scope name also shows in the top window bar; it's set as whatever instance is selected, i.e digtop or interrupt etc).
In new schematic window, goto view->high contrast. This turns ON contrast for better viewing.
 
nWave
------
gui that shows waveform viewer:
nWave -ssf test1.fsdb => This loads the fsdb file directly
Load fsdb file: do file->open. Then type the name of the dir containing the fsdb file in the white box. That shows the dirs and files in that dir in two windows below. Select the appropriate fsdb file in the RHS window. Click on Add, and then OK. This loads the fsdb file.
get signals: click on "get signals" (next to open file drawing)
important settings:
1. Waveform->Snap cursor to transitions. when this is set, then when we click on any signal waveform, then the cursor goes to the next edge. Useful when doing active annotation in debussy, since the change shows up in rtl signal values.
2. Tools->Preferences. It has almost all settings for the GUI. These settings persist even on quitting nWave. Goto View Options->Waveform Pane. Check the box "Highlight selected signals". This highlights selected signals.
3. To search for a signal name, enter it with the right hier and right case in "Find Signal". To search all hier, enter * at the end in "Scope"; then it searches for everything under that hier. For ex: if you are in digtop_tb hier, you will see "digtop_tb" in Scope. Just enter * after that, i.e: /digtop_tb/*
4. To set an alias file for state machines, etc, first select the signal that you want the alias to be set on in the waveform viewer. Then select the alias file as: waveform->Signal_value_radix->Add_alias_from_file, then choose the alias file and hit OK. Alias file syntax is (states_timergen.alias):
ALIAS timergen_sm
 PT_RESET          4'b0000
 PT_XG_INC         4'b0001
ENDALIAS

Verdi: superset of Debussy, as a lot more tools available.
------
    nCompare - Waveform compare (compare rtl and gate level waveforms).
    nSchema - Schematic browser (delay annotation).
    nState - State Diagram Debugger (displays the bubble diagram of state machines)
    nAnalyzer - Debug clock tree, clock and reset analysis, view multiple clock domains.
    nEco - Evaluate the changes made on the fly and validate them.
    SVTB - Gives the System Verilog Test Bench inheritance view; class variables can be viewed synchronously with other signals on nWave.
    Assertion Evaluator - Evaluates System Verilog assertions offline without the simulator.
    Power Manager - Debug the UPF and CPF files and visualize the different power domains in the design
    Temporal Flow View - Brings time, value and hierarchy onto the same window


Running Debussy:
--------------------
Dir: /db/Hawkeye/design1p0/HDL/Debussy/

#Before we can run debussy, we need to generate fsdb file and do sdf_annotation (for gate sim) in irun. fsdb generation is not necessary, since debussy can convert vcd into fsdb on the fly. sdf annotation is also not a necessity since we can always run gatesims w/o sdf annotation, but then it's not very useful.

#generate fsdb: add following lines in top level verilog code. (+loadpli option should be used on irun cmd)
#File: /db/Hawkeye/design1p0/HDL/Testbenches/digtop/kagrawal/tb/digtop_tb.v

   initial
     begin
`ifdef FSDB => note FSDB was defined in cmd line of irun, so this section is valid. It generates fsdb which is proprietary.
        $fsdbDumpvars;
        $fsdbDumpfile(`FSDBFILE);

      #5000; //below cmds needed only if we do not want dumping for all of sim time. Similar to vcd system tasks

      $fsdbDumpon; // This starts dumping

     #1000; //Dumps for 1000 time units starting from 5000 time units after sim starts

     $fsdbDumpoff; //this stops dumping
`endif
-----
#NOTE: In $fsdbDumpvars, we can also provide 2 arguments. 1st arg is name of block from which you want to dump fsdb, and 2nd var implies if we just want to dump for this block (1) or for all the hierarchy below it (0).  
ex: the code below dumps fsdb for digtop_tb (only top level since 2nd arg is 1), then dumps fsdb for digtop_00 which is a block within digtop_tb (all levels below it since 2nd arg is 0). The combined fsdb dump is in fsdbfile. So, in nWave, we'll see only digtop_tb. digtop_tb will contain digtop_00 module. digtop_00 module will contain all modules below it.
 $fsdbDumpvars(digtop_tb, 1);
 $fsdbDumpvars(digtop_00, 0);
 $fsdbDumpfile(`FSDBFILE);


-----

`ifdef VCD => if we need VCD (value change dump) which is std waveform database. Can be used with Novas Debussy as it supports both VCD and FSDB. See in verilog.txt for details on these system tasks.
        $dumpvars;
        $dumpfile(`VCDFILE);
`endif
     end // initial begin

SDF annotation: (for gate sims only)
--------------
annotator:
--------
The SDF file is brought into the analysis tool through an annotator. The job of the annotator is to match data in the SDF file with the design description and the timing models. Each region in the design identified in the SDF file must be located and its timing model found. Data in the SDF file for this region must be applied to the appropriate parameters of the timing model. SDF annotation is performed during elaboration, and can only take place at time 0.
2 ways to do sdf annotation:
-----------------------
A. $sdf_annotate utility:
The simulator only reads the compiled SDF file (sdf_filename.X). The SDF src file is provided in $sdf_annotate, and it is then compiled by the ncsdfc utility within the elaborator to generate the sdf_filename.X file, which is used by verilog-XL. Once the *.X file is there, it can be used by the simulator for subsequent runs.
for SDF annotation, we need to do the same thing as for the fsdb/vcd dump file in the top level module (digtop_tb). $sdf_annotate can only be in an initial block in verilog code, as annotation always takes place at time 0 only.

initial begin
      $sdf_annotate("/db/DRV9401/design1p1/HDL/FinalFiles/digtop_VDIO_Max_aligned.sdf", digtop_00,,"logs/sdf_max.log", "MAXIMUM"); // 7 args to sdf_annotate = name of sdf file, top level module inst name, cfgfile, logfile, MINIMUM/TYPICAL/MAXIMUM, scale_factor, scale_type.
#for min sdf ann
#$sdf_annotate("/db/DRV9401/design1p1/HDL/FinalFiles/digtop_VDIO_Min_aligned.sdf", digtop_00,,"logs/sdf_min.log", "MINIMUM"); => if sdf_annotate was called in some other module, then we had to specify the full hier, i.e. dut.digtop_00
end

NOTE: after running gatesim with sdf_annotate, look in sdf_max.log to make sure it has no errors or warnings. Else, sdf is not correctly annotated, or not annotated at all for such paths. Usually we get warnings like "ncelab: *W,SDFNEP: Unable to annotate to non-existent path ..." => this indicates that an arc was there in the verilog model file (i.e in AN210.v), for which there was no corresponding arc found in the sdf file. This usually happens with flops, where verilog models of the flop (i.e SDC210.v) may have setup and hold arcs separate, while the sdf file may have both combined as $setuphold, which may cause this warning. Arcs in the sdf file came from the .lib file, while sdf annotation is matching the arcs with the std cell verilog model file. So, basically every arc in the .lib file should match arcs in the specify section of the verilog model file. Sometimes we have conditional arcs in verilog (i.e arc from S->Y for MUX2). Corresponding arcs in the .lib file are written with "sdf_cond : "!A&&B";" etc. "ifnone" arcs in verilog are written with no "sdf_cond" in .lib files. These arcs are written as "CONDELSE" in sdf files. Sometimes, some of these conditional arcs missing in .lib files can cause sdf files to be missing these arcs too. PT/ETS run using .lib files, so they may also have incorrect timing, since timing tools choose the arc with the worst/best possible timing; if the missing arc has the worst/best timing, then the reported timing doesn't reflect that arc, resulting in incorrect timing.
NOTE: when generating sdf file, always use correct options, or some of the arcs might get removed from sdf file even though present in .lib files. One such example is using "CONDELSE" combo path arcs.

For ex: flop in SDC10.v has this in specify section:
     (CLK *> Q  ) = (0.100000:0.100000:0.100000 , 0.100000:0.100000:0.100000);
In SDF, 1st case shown below will pass while second will fail:
IOPATH CLK Q (1.0:1.0:1.0) (0.8:0.8:0.8) => pass
IOPATH (posedge CLK) Q (1.0:1.0:1.0) (0.8:0.8:0.8) => fail, since there is no negedge/posedge clause in the verilog model
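For reference, the edge-qualified IOPATH would only annotate if the verilog model itself had an edge-sensitive path in its specify block, e.g. (illustrative sketch only, not the actual SDC10.v model):

```verilog
specify
  // edge-sensitive path: this form would match "IOPATH (posedge CLK) Q" in sdf
  (posedge CLK *> (Q +: D)) = (0.100000:0.100000:0.100000, 0.100000:0.100000:0.100000);
endspecify
```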

Other warnings:
1. *W,NTCNNC: Non-convergence of negative timing check values in instance I_xyz/reg_5 => -ve timing check couldn't converge. see in verilog.txt for more details
2. *W,SDFNDP: Annotation resulted in a negative delay value or pulse limit to specify path or interconnect delay, setting to 0 => This happens when there are -ve values for delay in sdf file. Since simulator can't go back in time, it has to use 0 or +ve values. So, it sets all these -ve delay values to 0.
3. *W,SDFNEP: Unable to annotate to non-existent path (COND readcond (IOPATH CLK Q[24])) of instance DIG_TOP...U234 of module sshdbw00056025020 <../input/DIG_TOP_routed.fromPT.Min.sdf, line 169701> => This indicates that an arc was found in sdf but not in verilog model file. This usually happens with RAM/ROM IP, which may have intentional blackbox verilog models, which don't have any arcs.
NOTE: any of the above warnings do NOT cause missing annotations; the simulator runs with the verilog arcs, and uses the default delay or the sdf delay for each arc. So extra arcs in the sdf file are OK. Only when arcs are present in verilog but absent from sdf do we see unannotated arcs.

More options for sdf reporting:
1. -sdf_verbose: We can use option "-sdf_verbose" with the irun cmd to print a more detailed report in the sdf.log file. With "-sdf_verbose", we'll see each cell instance and the arcs annotated to it. It will have warnings (*W,SDFNEP) if, while annotating a cell from the sdf file, it's not able to find the corresponding arc in the verilog model file. Once all the cell arc annotation is done, we'll see "ABSOLUTE PORT:" delays, which show the interconnect delay for getting to an i/p pin of each instance. This is taken from the "INTERCONNECT" delay section of the sdf file. The reason we only see i/p pins of cells and NOT the o/p pins is that an interconnect delay is just needed for each i/p pin to form the full path. That is also the reason why interconnect delays are not specified b/w 2 points (o/p of one gate to i/p of other gate), as it's not needed.
2. -sdfstats: If we want to have more sdf stats for unannotated arcs, we can run irun with options "-sdf_verbose -clean -sdfstats sdf_unannotated.txt". Then it shows a list of unannotated arcs with their corresponding cells. Arcs that are in verilog model, but not in sdf are the arcs that are left unannotated (and shows up as less than 100% annotation). In that case, simulator takes the default delay of such arcs from the verilog model file.

B. Cmd file:
Instead of using annotator cmd ($sdf_annotate), we can do sdf annotation using these 3 steps:
1. generate compiled sdf file using this cmd on the unix shell:
ncsdfc SPI.sdf -output SPI.blah => generates SPI.sdf.X in the current dir if no output file specified with -output.
2. write sdf cmd file: There are seven statements, which correspond to the seven arguments of the $sdf_annotate system task. Only one statement is required: the COMPILED_SDF_FILE statement, which specifies the compiled SDF file that you want to use. Others are optional (create cmd file named: myfile.sdf_cmd). Note, the file has to be terminated with a ;
COMPILED_SDF_FILE = digtop_func_W_125_1.62.sdf.X,
SCOPE = :pm7324_inst, => annotate to the VHDL scope :pm7324_inst, which may contain Verilog blocks. For us, it's :UUT or tb_digtop.dut.
LOG_FILE = "pm7324_flat.sdf.log", =>log
MTM_CONTROL = "TYPICAL", => min/typ/max. Indicates which triplet will be used.
SCALE_FACTORS = "1.0:1.0:1.0", => optional. mult factor for min/typ/max
SCALE_TYPE = "FROM_MTM"; => optional. scales timing specs FROM_MINIMUM/FROM_TYPICAL/FROM_MAXIMUM/FROM_MTM. i.e it indicates which of the 3 triplets will be used. For ex: if MTM_CONTROL = "TYPICAL", then we specify SCALE_TYPE = "FROM_TYPICAL".
3. #for ncelab, use ncelab -sdf_cmd_file filename option to include the SDF command file.
ncelab -sdf_cmd_file myfile.sdf_cmd worklib.top
#For irun, we can use the same option: irun .... -sdf_cmd_file myfile.sdf_cmd -sdf_verbose ...

When running irun, we see annotation message like this:
     Reading SDF file from location "/vobs/.../digtop_func_QC_NOM_1.8_ATD-N_25_1.8-.sdf"
     Writing compiled SDF file to "/sim/.../../digtop_func_QC_NOM_1.8_ATD-N_25_1.8-.sdf.X".
    Annotating SDF timing data:  ....    
    Annotation completed successfully...
    SDF statistics: No. of Pathdelays = 29695  Annotated = 100.00% -- No. of Tchecks = 38702  Annotated = 99.99% => Path_delays/Tchecks refer to ones in verilog model for cells, while Annotated refer to ones in sdf
                        Total        Annotated      Percentage
         Path Delays           29695           29695          100.00 => path delays refer to IOPATH in cell, and not to interconnect delay. Here verilog model IOPATH(under Total) for all cells match sdf IOPATH(under Annotated). Reason for mismatch would be when there's an extra gate in netlist but not in sdf file
             $period               2               2          100.00
              $width            6942            6942          100.00
             $recrem            4506            4506          100.00
          $setuphold           27252           27250           99.99 => 2 setuphold arcs in verilog for which the annotator didn't find a corresponding arc or timing in sdf. This needs to be fixed as they should match exactly at 100%.
NOTE: missing interconnect delays will be reported separately as "ncelab: *W,SDFINC: interconnect ... not connected to ..."

NOTE: If we provide non-existent sdf file in $sdf_annotate, then irun doesn't give any warnings. We don't see any annotation messages as shown above. Instead delays from verilog models (ex 0.01ns for gates when TI_verilog is defined) are taken, and annotation is done using those delays. As a result, we may see tons of timing violations for cells. Best way to find out is to pull up waveform and check delay for buffers/inverters and make sure they match those from sdf files.
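Since a silently missing sdf file gives no warning, the "SDF statistics" line is worth checking programmatically in regressions. A small sketch of such a check (this is not a Cadence utility; the log-line format is assumed from the example above):

```python
import re

def sdf_annotation_ok(log_text, min_pct=100.0):
    """Parse the 'SDF statistics' summary line from an irun/ncelab log.
    Returns (pathdelay_pct, tcheck_pct, ok). A missing stats line is
    treated as a failure, since it can mean no annotation happened at all."""
    m = re.search(r"No\. of Pathdelays\s*=\s*\d+\s+Annotated\s*=\s*([\d.]+)%"
                  r"\s*--\s*No\. of Tchecks\s*=\s*\d+\s+Annotated\s*=\s*([\d.]+)%",
                  log_text)
    if not m:
        return None, None, False
    path_pct, tcheck_pct = float(m.group(1)), float(m.group(2))
    return path_pct, tcheck_pct, (path_pct >= min_pct and tcheck_pct >= min_pct)

# example line copied from the log excerpt in the section above
log = "SDF statistics: No. of Pathdelays = 29695  Annotated = 100.00% -- No. of Tchecks = 38702  Annotated = 99.99%"
print(sdf_annotation_ok(log))   # -> (100.0, 99.99, False): tchecks below 100%
```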

------------
SDF file format is below in another section.

----------------
#Then we run Ncverilog or irun with loadpli1 (pointing to verdi PLI), and we get waveform dump. Then we start running debussy in separate dir to debug this waveform.

script: create_symbols for debussy/verdi:
-----------------------
creating symbols: Debussy/Verdi can display gate-level schematics using the proper symbols for the cells used in the netlist.  To enable this, you must set up a Debussy/verdi symbol library for the target cell library.  The symbol library can be created by running the utility syn2SymDB on the equivalent Synopsys Liberty (.lib) library.
syn2SymDB -o foo_u foo.lib foo1.lib =>
     -o:  Specifies output library name
      foo.lib:  Synopsys library name. Other lib can be added separating them by space
 This creates symbol library (directory) called foo_u.lib++.

NOTE: we can also run "vericom" compiler by synopsys to generate foo_u.lib++
cmd: vericom -2013.09 -sv -f list_rtl.f -lib VerdiLib => reads rtl files to generate VerdiLib.lib++

ex: just typing syn2SymDB may not work, so type the whole path
/apps/novas/debussy/2010.04/platform/LINUXAMD64/bin/syn2SymDB -o symbol \
/db/pdk/lbc8lv/current/diglib/msl445/current/synopsys/src/MSL445_N_25_1.8_CORE.lib \
/db/pdk/lbc8lv/current/diglib/msl445/current/synopsys/src/MSL445_N_25_1.8_CTS.lib \
/db/pdk/lbc8lv/current/diglib/msl445/current/synopsys/src/MSL445_N_25_1.8_ECO.lib
=> creates symbol.lib++ dir.

You must reference this symbol library by setting the following two environment variables:
     setenv TURBO_LIBS "foo_u"
     setenv TURBO_LIBPATHS <path to the directory containing the symbol library directory>

We can also include these 2 variables in novas.rc file as:
   TurboLibs = symbol
   TurboLibPaths = /data/VIKING_OA_DS3/a0783809/debussy/lib
=> novas.rc gets loaded anytime debussy is invoked, so it looks in "lib" dir for "symbol.lib++" and adds all those symbols.

#Invoke Debussy and compile/load your netlist.
debussy -2012.04 /data/.../DIG_TOP_routed.v => This loads PnR netlist so that we can see schematic of this. (2012 version shows old gui, while later ones show new gui)
verdi /data/.../DIG_TOP_routed.v -upf2.0 Top.upf -upftop digtop => Loads PnR netlist into verdi (-upf loads upf to show various power domains in design. If loading upf, top module name for upf needs to be provided)


debussy quick tips:
------------
0. clicking on the AND gate symbol (2nd row 3rd col on gui) brings up the schematic.
1. When tracing loads, click on any net and click "Trace Load". Then from top, do tools->New Schematic->From Trace Results. This brings a new window which only shows net and all loads. This is helpful to see all loads on any net.
2. click Schematic->Find (or Caps A), and put name of nets/instance and it will show all. Select one that you need and click "c" to change color of that net.

script: run_debussy_rtl/run_debussy_gate: for gate runs and rtl runs
----------------------
run_debussy_rtl:

#in our dir, we see PML30.lib++. So, we set these var as follows and invoke debussy. NOTE: we don't really need these symbols since rtl only has clk gaters instantiated from library, so those will show as square box.
setenv TURBO_LIBS PML30
setenv TURBO_LIBPATHS /db/Hawkeye/design1p0/HDL/Debussy/
debussy -f list_rtl.f -vtop vtop.map -2001 -autoalias & => we can also just use "debussy &"

list_rtl.f
-------
-f /db/DRV9401/design1p1/HDL/Source/digtop_rtl.f => has paths to all rtl files from source area: /db/DRV9401/design1p1/HDL/Source/digtop.v, global.v, etc
/db/DRV9401/design1p1/HDL/Testbenches/kagrawal/digtop/tb/digtop_tb.v => has path to top level tb block

run_verdi_rtl:

----------------

#invoke verdi to load RTL

vericom -2013.09 -sv -f list_rtl.f -lib VerdiLib => invoke vericom to create VerdiLib.lib++ from rtl files. (for some reason, this gives lots of errors when reading verilog packages. Options "-2012 -ssv -ssy" seem to resolve all these errors. -2012 enables system verilog constructs (probably same as -sv), while "-ssv -ssy" enables the verdi database for library cells.)

verdi -lib VerdiLib -top digtop => Here we are loading VerdiLib.lib++, no need to specify RTL files, as lib++ already has lib built from rtl from earlier step (when running vericom)

verdi -f list_rtl.f => This loads list_rtl.f directly instead of generating lib thru vericom. For some reason, this gives lots of errors with packages.

vtop.map: debussy accesses already dumped fsdb files. The map file maps hier in fsdb to that in RTL.
-------
digtop = digtop_tb.digtop_00 => this provides the hier path to the dut (digtop_00 is the instance name of digtop [digtop is top level RTL module] instantiated in digtop_tb)

run_debussy_gate: same as with rtl except that we run it directly on gate netlist:
---------------
list_gate.f
/db/DRV9401/design1p0/HDL/FinalFiles/digtop_VDIO.v => path to gate level netlist
../Testbenches/digtop/tb/digtop_tb.v => path to top level tb

vtop.map
-------
digtop_top = digtop_tb.digtop_00 (If top level module in gate netlist is called digtop_top, then that is what we specify. This digtop_top ties gate level netlist with tb file)

NOTE: If we get *.vcd file from analog team, then to run debussy, we need to map the hier from .vcd file to our gate netlist. so, in vtop.map:
digtop_top =  zorro_toplevel_sch.I3.I7.I0 (here I0 at the end refers to the inst of "digtop_top" module in gate level netlist digtop_VDIO.v. zorro_toplevel_sch is the schematic name within which we have I3 top level block, which contains digital wrapper I3 within which we have digital block I0)

running debussy when debugging RTL:
----------------------------------
Bring up Debussy nTrace. goto source-> mark Parameter annotation and active annotation.
Now, open nWave by going to Tools->New Waveform.
Now, we can drag and drop signals from nTrace to nWave and vice versa, and observe signals.
1. We can click on clk edge in nWave and that will show which values changed.
2. We can click on signal names in nTrace and it will backtrace it.
3. We can click "c" on any net, and we can set the net to a chosen color.
4. We can open 2 nWave windows from nTrace by going to Tools->New Waveform twice. We can goto the nWave "window" button and turn ON sync waveform view. We do it for both the windows so that clicking on the cursor in any one of them will affect the other (if we do it for only one of them, then clicking on the cursor in that window will affect the other, but not the other way around). Then the 2 nWave windows will be synced in time, so that it's easier to compare results (for ex b/w RTL and gate)
5. NOTE: when we open nWave using Debussy and do active annotation, we will see the name of the fsdb file on the top panel of the debussy window. That is the fsdb file that is actively annotated with the current RTL that we see in the RHS of the debussy main window. If we open any other nWave window and any other fsdb file, it will NOT be actively annotated with that RTL. To actively annotate the other fsdb file, we goto the nWave window of the new fsdb and click Window->change to primary. This changes this new nWave window to be actively annotated with the current RTL (we will see the name of this new fsdb file on the top panel of the debussy window). So, we can switch back and forth b/w multiple nWave windows.
6. NOTE: sometimes when we load a new fsdb from the nWave window, it may not get annotated properly with the rtl. So, the best way to open a new fsdb is to do it from the Debussy panel. In debussy, goto File->close simulation results. This kills the current fsdb, but retains all the signals, so that we don't have to save them. Now, do File->load simulation results and open the new fsdb. This is the correct way to view a new fsdb.

running debussy when looking at gate netlist for ECO:
----------------------------------------------------
run_debussy_eco: here we are just looking at schematic of gate netlist, so we invoke debussy with just gate level netlist.

#in our dir, we see PML30.lib++. So, we set these var as follows and invoke debussy.
setenv TURBO_LIBS PML30
setenv TURBO_LIBPATHS /db/Hawkeye/design1p0/HDL/Debussy/
debussy -f list_gate.f & => list_gate.f has path to gate level netlist /db/DRV9401/design1p0/HDL/FinalFiles/digtop_VDIO.v
#debussy & => If we call debussy w/o -f option, then we have to do File->Import design, Put the file name (/db/DRV9401/design1p0/HDL/FinalFiles/digtop_VDIO.v) in bottom box, and then click Add, then OK.

Then click on Tools->New Schematic->Current scope

patgen files:
-----------
Ex: /db/MOTGEMINI_DS/design1p0/HDL/Testbenches/digtop/patgen

verilog models used: (for lbc7)
--------------------
A. MODEL_functiononly: timescale is 1ps/1ps. It has following delays specified:
 1. gates (AN2,etc) = 0
 2. clk gating cells (CG*) = 0
 3. flops, c2q delay = 1ps. For ex: in DTP20.v (in lbc7), "buf" and "not" gates are specified delay of #1(1ps), so final o/p Q/QZ have delay of 1ps.

B. MODEL_verilog: timescale is 1ns/1ps. It has following delays specified:
 1. gates = 10ps(#0.01) (in specify section =>)
 2. clk gating cells (CG*) = 100ps(#0.1) (in specify section =>). Also does setup/hold checks.
 3. flops, c2q delay = 100ps(#0.1) (in specify section =>). Also does setup/hold checks.

C. If nothing defined (neither MODEL_functiononly nor MODEL_verilog). timescale is 1ns/1ps. same as MODEL_verilog except no checks done:
 1. gates = 10ps(#0.01) (in specify section =>)
 2. clk gating cells (CG*) = 100ps(#0.1) (in specify section =>). NO setup/hold checks done.
 3. flops, c2q delay = 100ps(#0.1) (in specify section =>). NO setup/hold checks done.

NOTE: we used "specify" blocks instead of hardcoding delays with "#" so that when we do sdf annotation, the delays in the specify section are disregarded (overridden by the sdf values). If we had hardcoded delays with #, we would have double counted the delay, as sdf annotation would have happened on top of the existing delay in the verilog model.
NOTE: only the delay numbers are disregarded in the specify section; all arcs (c2q, setup/hold, rec/rem, width, etc) are still honored, and violations are passed to the appropriate notifier in the verilog model (using delay numbers from the sdf file).

NOTE: When we run AMS sims, we run toplevel sims directly on the digital schematic, which is a gate level netlist. We don't have an sdf file to annotate delays for gates. So, we set "MODEL_functiononly" as that will cause no setup/hold issues. Flops will always have 1ps delay, so they will always have enough setup time, and since all comb gates on clk have "0" delay, there will be no hold issues. If we run it with MODEL_verilog (or with nothing defined), then hold issues may show up, which may not be actually present in the ckt. This hold will show up if (c2q + data_path_delay < clk_path_delay). Usually the clk path has < 10 clk buffers, so no hold issues. However, if even 1 clk gating cell gets added on the clk path, then hold will get violated, as clk will change before data (if no of clk buffers is greater than no of gates in data path) or at the same time as data (if no of clk buffers is same as no of gates in data path).
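The hold condition above can be checked with a quick sketch (the delay numbers below are hypothetical ps values for illustration, not tied to any real library):

```python
def gatesim_hold_ok(c2q_ps, data_path_ps, clk_path_ps):
    # Hold is safe in gate sim if new data arrives at the capture flop
    # strictly after the clk edge does: c2q + data_path_delay > clk_path_delay.
    # Arriving at the same time also counts as a violation, per the note above.
    return c2q_ps + data_path_ps > clk_path_ps

# MODEL_functiononly: c2q=1ps, comb gates on clk have 0 delay -> always safe
print(gatesim_hold_ok(1, 0, 0))        # True
# MODEL_verilog with a 100ps clk gating cell + 5 buffers (50ps) on clk,
# c2q=100ps and 3 comb gates (30ps) in the data path -> hold gets flagged
print(gatesim_hold_ok(100, 30, 150))   # False
```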

modeling delays in simulations:
------------------------------
By default, verilog gate level model delays and interconnect delays are always simulated as transport delays, but they look as if they are simulated as pure inertial delays (since they don't allow glitches shorter than the prop delay to pass thru). This is because, by default, pulse_r and pulse_e are set to 100%. These are verilog cmd line switches that can be used to alter this behaviour for gate level sims. Delays inside of specify blocks are affected when these cmd line switches are used with simulators (add +transport_path_delays also):
A. +pulse_r/R% : switch forces pulses that are shorter than R% (R=0 to 100) of the propagation delay of the device being tested to be "rejected" or ignored.
B. +pulse_e/E% : switch forces pulses that are shorter than E% (E=0 to 100) but longer than %R of the propagation delay of the device being tested to be an "error" causing unknowns (X's) to be driven onto the output of the device. Any pulse greater than E% of the propagation delay of the device being tested will propagate to the output of the device as a delayed version of the expected output value.
scenarios are as below:
  0% ------  R%  -------   E% ------  100%
 --- reject  --  error(x)  -- output  --- => So, glitches can be rejected, output an x or get out as normal delayed version depending on settings.

Ex: vcs -RI +v2k tb.v delaybuf.v +pulse_r/0 +pulse_e/0 +transport_path_delays => causes pulses shorter than 0% to be rejected, and pulses greater than 0% to be propagated to the o/p. => all pulses are passed, no matter how small.
Ex: +pulse_r/0 +pulse_e/100 => causes no glitches to be rejected, but o/p x, for glitches shorter than propagation delay.
Ex: +pulse_r/100 +pulse_e/100 => models inertial delays, where all pulses shorter than propagation delay are ignored.
Ex: +pulse_r/20 +pulse_e/20 => causes  glitches <20% to be rejected, but glitches >20% to be passed.
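The Ex lines above can be summarized in a small model of the reject/error/pass decision (a sketch of the documented switch semantics, not a simulator API):

```python
def classify_pulse(pulse_width, prop_delay, r_pct=100, e_pct=100):
    # Defaults of 100/100 model the simulator default (pure inertial delay).
    if pulse_width < prop_delay * r_pct / 100.0:
        return "rejected"   # glitch filtered out entirely
    if pulse_width < prop_delay * e_pct / 100.0:
        return "x"          # propagates as an unknown (X) on the output
    return "passed"         # comes out as a normal delayed pulse

print(classify_pulse(5, 10))            # 'rejected' (default 100/100)
print(classify_pulse(5, 10, 0, 0))      # 'passed'   (+pulse_r/0 +pulse_e/0)
print(classify_pulse(3, 10, 20, 50))    # 'x'        (between R% and E%)
```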

NOTE: when we run gate sim, we may start seeing "glitch suppression" warnings (many times after adding pulse_r/pulse_e switches).
EX: Warning!  Glitch suppression
           Scheduled event for delayed signal of net "GVC_D_D" at time 1027453294 PS was canceled!
            File: /db/pdkoa/lbc8lv/current/diglib/msl458/PAL/CORE/verilog/SDP10B_LL.v, line = 92
           Scope: tb_digtop.dut.I_i2c_top.I_bellatrix_i2c_slave.I_meson_i2c_fsm.bitCnt_reg_2
            Time: 1027453096 PS

Glitch suppression: This happens when there are -ve timing values, which causes the simulator to use delayed signals. When a delay with two values is calculated, there is the possibility that an event on the input net may cancel a scheduled event on the internal signal driven by the delay. This is called glitch suppression. Because glitch suppression can hide input events from a timing check's input, the simulator generates a glitch suppression timing violation if an event on a delayed signal is canceled.
To suppress the warnings due to the glitch suppression algorithm, use the -nontcglitch simulation option  

NOTE: the above cmd line switches only valid for delays in specify block, not for delays using SDF annotation. For sdf delays, we need to have these in absolute numbers within sdf file for each cell.
NOTE: to specify reject/error, we need to have extra parentheses, like this:
ex: (IOPATH A Y ((rise_delay) (rise_reject) (rise_error)) ((fall_delay) (fall_reject) (fall_error)) ) => extra parentheses; empty parentheses for reject/error imply that reject/error is set equal to the delay value => inertial delay model
ex: (IOPATH A Y (rise_delay) (fall_delay)) => no extra brackets, so values are delay values. no reject/error values.
ex:
(CELL
  (CELLTYPE "IV110")
  (INSTANCE U32)
  (DELAY
    (ABSOLUTE
    (IOPATH A Y ((0.066:0.066:0.066) (0.015:0.015:0.015) (0.019:0.019:0.019)) ((0.059:0.059:0.059) (0.012:0.012:0.012) (0.017:0.017:0.017))) => 66ps for o/p rise delay, 15ps is rise reject limit while 19ps is rise error limit. 59ps for o/p fall delay, 12ps is fall reject limit while 17ps is fall error limit.
    )
  )
)
Ex: we can also use "PATHPULSEPERCENT" keyword in sdf file to specify reject and error limits in % terms.
    (IOPATH A Y (0.066:0.066:0.066) (0.059:0.059:0.059))
    (PATHPULSEPERCENT A Y (25) (35)) => 25=pulse reject limit in %, 35=pulse error limit in %
-----------------------

SDF file syntax: ( /db/Hawkeye/design1p0/HDL/Primetime/digtop/sdf/digtop_max.pt.sdf)
-----------------
OVI (open verilog intl) developed SDF v3 syntax. timing calc tools (PT,etc) are resp for generating SDF.

syntax:
------
(DELAYFILE
(SDFVERSION "OVI 3.0")
(DESIGN "digtop")
(DATE "Thu Jul 21 20:22:34 2011")
(VENDOR "PML30_W_150_1.65_CORE.db PML30_W_150_1.65_CTS.db")
(PROGRAM "Synopsys PrimeTime")
(VERSION "D-2010.06")
(DIVIDER /) => hier divider is / (by default, it's .) a/b/c
// OPERATING CONDITION "W_150_1.65" => // is for comment
//triplets are always in form - min:typ:max for delay
(VOLTAGE 1.65:1.65:1.65) => best:nom:worst
(PROCESS "3.000:3.000:3.000") => best:nom:worst
(TEMPERATURE 150.00:150.00:150.00) => best:nom:worst
(TIMESCALE 1ns) => implies all delay values are to be multiplied by 1ns
//delays specified in CELLS for both interconnect and cell delay.
//interconnect delays => we may have the block below repeated many times as only some wires may be in each block. It's easier for readability. interconnect delays are always between 2 points => o/p of one gate to i/p of other gate.
(CELL => interconnect delays specified here. interconnect delays are of the order of ps (very small), while cell delays are of the order of ns. All INTERCONNECT delays are only specified for the top level module (digtop). For wires which are not in digtop, hier names are used.
  (CELLTYPE "digtop")
  (INSTANCE) //no instance specified, implying it's interconnect delay
  (DELAY =>
    (ABSOLUTE => delay can be ABSOLUTE or INCREMENT
    (INTERCONNECT scan_out_iso/U282/Y em_out_31_I_bufx4/A (0.008:0.008:0.008) (0.008:0.008:0.008)) //rise/fall (min:typ:max) delays. min:typ:max are same delays for one sdf file as we use separate sdf files for min/typ/max corners. However, if we use newer tools as tempus to generate sdf, we may see (0.41::0.62), which indicates that for sdf generated at a particular corner (say NOM.sdf), we may have different values for min,typ,max. In timing tools for OCV runs, for a given corner (say NOM), min value in triplet is used for clk, max for data (for setup check) and vice versa for hold. However for gatesim, it takes only one value for all paths, and we specify what triplet value to use (by stating "MAXIMUM","TYPICAL" or "MINIMUM" in sdf_annotate). So, ideally, we should run gate sims with "MAX" triplet with QC_MAX.sdf, and "MIN" triplet with QC_MIN.sdf. "MAX" and "MIN" triplet with QC_NOM.sdf is not really needed as it will be bounded by MAX/QC_MAX.sdf and MIN/QC_MIN.sdf.
    (INTERCONNECT scan_out_iso/U164/Y a2d_trg_out_I_bufx8/A (0.001:0.001:0.001)) //same delay for rise/fall (NOTE: hier names used)
    ...
    )
  )
)   
//cell delays
(CELL => delay for cells: delays for each instance defined separately, since it may be diff based on load.
  (CELLTYPE "NA210") =>nand gate
  (INSTANCE test_mode_dmux/U85) => in test_mode_dumx module. since instance specified, it's cell delay
  (DELAY
    (ABSOLUTE
    (IOPATH A Y (0.129:0.129:0.129) (0.170:0.170:0.170)) //rise/fall for Y (min:typ:max)delays. We don't specify rise/fall for A as it's automatically decided based on direction of Y.
    (IOPATH B Y (0.157:0.157:0.157) (0.158:0.158:0.158))
    (COND !A&&!B (IOPATH Y S  (0.630::0.641) (1.470::1.476))) //some complex cells(adders, etc) will have cond delay arcs.
    )
  )
)
(CELL => flop delay. flops will have delay arcs as well as timing check arcs.
  (CELLTYPE "TDC10")
  (INSTANCE spi/data_reg_15)
  (DELAY
    (ABSOLUTE
    (IOPATH CLK QZ (0.622:0.622:0.622) (0.624:0.624:0.624)) => NOTE: sdf doesn't say rise/fall of CLK in IOPATH. Only rise/fall of QZ. However, model file specifies QZ delay wrt posedge or negedge CLK. So, there's always this discrepancy b/w verilog model file and sdf file for all IOPATH.
    (IOPATH CLK Q (1.308:1.308:1.308) (0.874:0.874:0.874))
    (IOPATH CLRZ QZ (0.936:0.936:0.936) ())
    (IOPATH CLRZ Q () (1.217:1.217:1.217))
    )
  )
  (TIMINGCHECK => checks
    (WIDTH (posedge CLK) (0.176:0.176:0.176)) => min allowable time for +ve(high) pulse of clk
    (WIDTH (negedge CLK) (0.692:0.692:0.692)) =>  min allowable time for -ve(low) pulse of clk
    (WIDTH (negedge CLRZ) (0.330:0.330:0.330)) => min allowable time for -ve(low) pulse of clrz
    (SETUPHOLD (posedge D) (posedge CLK) (0.437:0.437:0.437) (-0.263:-0.263:-0.263)) => setup and hold checks for rising edge of D wrt +ve clk. first triplet(0.437) is for setup, while second(-0.263) is for hold. triplets are min:typ:max delays. SETUP and HOLD can also be separated by using SETUP and HOLD keywords. NOTE: setup is +ve, while hold is -ve (typically true for flops as data lines inside flops have extra gates before they hit clk logic)
    (SETUPHOLD (negedge D) (posedge CLK) (0.716:0.716:0.716) (-0.288:-0.288:-0.288)) => similarly for falling edge of D
    (SETUPHOLD (posedge SCAN) (posedge CLK) (0.954:0.954:0.954) (-0.592:-0.592:-0.592))
    (SETUPHOLD (negedge SCAN) (posedge CLK) (0.659:0.659:0.659) (-0.538:-0.538:-0.538))
    (SETUPHOLD (posedge SD) (posedge CLK) (0.472:0.472:0.472) (-0.317:-0.317:-0.317))
    (SETUPHOLD (negedge SD) (posedge CLK) (0.756:0.756:0.756) (-0.332:-0.332:-0.332))
    (RECREM (posedge CLRZ) (posedge CLK) (0.405:0.405:0.405) (0.084:0.084:0.084)) => recovery check is like setup check for clrz, where it should go inactive sometime before the clk, so that flop i/p can get flopped. Removal check is like hold check for clrz, where it should go inactive sometime after the clk, so that flop i/p doesn't get flopped that cycle, but the next cycle. RECREM combines RECOVERY and REMOVAL checks in one. 1st triplet(0.405) is recovery, 2nd(0.084) is removal.
  )
)
(CELL => clkgater delay
  (CELLTYPE "CGPT40")
  (INSTANCE hwk_regs/clk_gate_ccd_brightness_out_reg/latch)
  (DELAY
    (ABSOLUTE
    (IOPATH CLK GCLK (0.642:0.642:0.642) (0.610:0.610:0.610))
    )
  )
  (TIMINGCHECK
    (WIDTH (negedge CLK) (0.538:0.538:0.538))
    (SETUPHOLD (posedge TE) (posedge CLK) (0.701:0.701:0.701) (-0.448:-0.448:-0.448))
    (SETUPHOLD (negedge TE) (posedge CLK) (0.795:0.795:0.795) (-0.508:-0.508:-0.508))
    (SETUPHOLD (posedge EN) (posedge CLK) (0.468:0.468:0.468) (-0.214:-0.214:-0.214))
    (SETUPHOLD (negedge EN) (posedge CLK) (0.721:0.721:0.721) (-0.430:-0.430:-0.430))
  )
)
(CELL => latch delay
  (CELLTYPE "LAH11")
  (INSTANCE flipper_top/flipper_ram/flipper_ram_reg_185_5)
  (DELAY
    (ABSOLUTE
    (IOPATH C Q (0.826:0.826:0.826) (1.097:1.097:1.097))
    (IOPATH D Q (0.622:0.622:0.622) (1.250:1.250:1.250))
    )
  )
  (TIMINGCHECK
    (WIDTH (posedge C) (0.733:0.733:0.733))
    (SETUPHOLD (posedge D) (negedge C) (0.464:0.464:0.464) (-0.339:-0.339:-0.339))
    (SETUPHOLD (negedge D) (negedge C) (1.116:1.116:1.116) (-1.079:-1.079:-1.079))
  )
)

(CELL //hard IP
    (CELLTYPE  "ophdll00032008040") => otp
    (INSTANCE  I_i2c_top/I_bellatrix_i2c_otp/I_otp_32x8)
      (DELAY
    (ABSOLUTE
    (IOPATH CLK Q[0]  (27.7495::27.7495) (7.7705::7.7705)) ...
    (IOPATH CLK Q[7]  (27.7482::27.7482) (7.7696::7.7696))
    (COND WRITECOND (IOPATH CLK BUSY  () (18.6512::18.9356)))
    (COND READCOND (IOPATH CLK BUSY  (5.5594::5.5594) (73.8027::73.8027)))
    (IOPATH PROG BUSY  (17.0202::17.2431) (18.7960::19.0702))
    )
      )
      (TIMINGCHECK
    (SETUPHOLD (posedge READ) (posedge CLK) (23.4053::23.4053) (4.3649::4.3649)) ...
    (WIDTH (COND WRITECOND (posedge CLK)) (50000.0000::50000.0000)) ... => This WRITECOND should be there in verilog model of otp else tool will complain about missing "WRITECOND". This "WRITECOND" initially came from .lib file.
    (PERIOD (COND WRITECOND (posedge CLK)) (50202.0000::50202.0000)) ...
    (SETUPHOLD (posedge D[0]) (posedge CLK) (0.1406::0.1406) (5.4447::5.4447)) ...
    (SETUPHOLD (negedge A[4]) (posedge CLK) (0.7298::0.7298) (4.0518::4.0518))
    (SETUPHOLD (negedge PROG) (posedge READ) (163.5221::163.5221) ())
    (SETUPHOLD (negedge CLK) (posedge READ) (163.5160::163.5160) ())
      )
)

-------

SDF supports both a pin-to-pin and a distributed delay modeling style. We use pin to pin.
SDF supports setup, hold, recovery, removal, maximum skew, minimum pulse width, minimum period and no-change timing checks.
interconnect delay: SDF supports two styles of interconnect delay modeling.
A. The SDF INTERCONNECT construct allows interconnect delays to be specified on a point-to-point basis from o/p port of one device to i/p port of other device. This is the most general method of specifying interconnect delay.
B. The SDF PORT construct allows interconnect delays to be specified as equivalent delays occurring at cell input ports. This results in no loss of generality for wires/nets that have only one driver. However, for nets with more than one driver, it will not be possible to represent the exact delay.

cell delay: SDF supports 2 types of cell delay.
A. IOPATH implies delay from i/p port of device to o/p port of same device. We use this for all simple cells.
B. COND implies conditional i/p to o/p path delay. We use this for complex cells (adders, etc).
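The min:typ:max triplets used throughout the sdf above, including the empty fields seen in e.g. (0.41::0.62), can be decoded with a tiny sketch like this (a hypothetical helper for illustration, not part of any tool flow):

```python
def parse_triplet(s):
    # "min:typ:max"; an empty field means that value is absent in the sdf
    return tuple(float(p) if p else None for p in s.split(":"))

print(parse_triplet("0.066:0.066:0.066"))  # (0.066, 0.066, 0.066)
print(parse_triplet("0.41::0.62"))         # (0.41, None, 0.62)
```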

************************************************

Makefile:


make utility in unix is an interpreter for Makefile. Makefile is like a shell script (similar to test.csh, run.bash, etc). The only difference is that Makefile is not an executable (not sure why it's not required to be executable, as it may have unix cmds in it, and can be run by anyone). We write the script in a file called Makefile (note capital M in Makefile). Then we run the interpreter called "make", which interprets this Makefile (Makefile is the default file make looks for; we can also specify other files for make to look at) and produces the desired outcome. make uses rules in Makefile to run (Makefile is placed in the same dir as where make is run). It is a very important utility in Linux, as many programs/applications use a Makefile to generate executable files. If you want to write and compile your own large program, Makefile is essential there too. Makefile is basically a file which says what actions to take, depending on what dependencies it has. Makefile was written because some programmer forgot to recompile a file that had changed. That cost him many hours of wasted time. make was written so that it could keep track of what changed, and recompile the needed files automatically, w/o the user being bothered with it. We can entirely do away with Makefile if we can manage everything manually.

Makefile is used extensively in generating executables for programs as C. It's also used as wrapper for calling multiple bash/csh scripts via just 1 cmd.

Very good and short intro to Makefile is here:

http://www.jfranken.de/homepages/johannes/vortraege/make_inhalt.en.html

Authentic make documentation from GNU: http://www.gnu.org/software/make/manual/make.html

Makefile Syntax:

Makefile has separate lines. Each line ends with a newline (no ; needed unless we put multiple cmds on same line). If we want to continue current line to next line, we can use "\" at end of line (before entering newline). That way "make" sees next line as continuation of previous line. General syntax of Makefile is ( [ ] implies it's optional):

target [ more targets] :[:] [ prerequisites ] [; commands] => If we want to put cmds along with prerequisites, then we need to have ; to separate cmds from prerequisites.

[ <tab> commands ] => Note that every command line must begin with a literal tab character (spaces will NOT work; this is a classic Makefile gotcha).

[ <tab> commands ]

...

Makefile consists of 5 things:
1. comments: start with #
2. defn of variables/functions: myvar = adb, or myvar := adb (spaces don't matter, i.e myvar=adb is fine too). Now myvar can be accessed anywhere in Makefile by using $(myvar). We should always use $(myvar) instead of $myvar, as myvar may not be treated as a single var when not inside (), causing $myvar to be expanded as $m followed by yvar. We can apply var to a subset of a string, i.e: $(myvar)_module/ABC/X will expand to adb_module/ABC/X. Curly braces {} can be used too. "=" expands var/func at use time, while ":=" expands them at declaration time.

myvar ?= adb => ?= is a conditional assignment and assigns a value to var only if that var hasn't been defined previously.

    ex:  XTERM_TITLE = "$@: `pwd`"

    ex: ARGS = -block $(DES) $(SCR)/tmp/a.tx -name kail


3. Includes: to include other Makefiles, since 1 Makefile may get too big. When "-" is placed in front of any cmd, make ignores that cmd in case of errors and moves on. If "-" is not placed before a cmd, and that cmd fails, then make aborts. "-include Makefile.local" will include Makefile.local, and if Makefile.local is missing, make will keep on moving (w/o aborting).

4. Rules: hello: a; @echo hello

5. other cmds: all unix cmds that can be used in shell, can be used in Makefile. (ex: echo, for ... do .. done, export, etc). These cmds can be put directly on action/cmd line of rules.

ex: @rm -rf $(DES) => removes files. @-rm -rf * => - in front causes the cmd to be ignored if there's any error executing the cmd, and make moves forward with next line.

Rules: We will talk about rules, since they are the heart of Makefile:
ex: below ex defines rules for targets hello & diskfree. Rules have multiple lines. 1st line is the rule or dependency (or prerequisite) line, 2nd line is the action line. 1st line says that the prerequisite has to be satisfied before 2nd line can be run. make will check to see if the prerequisite is up to date based on its own dependencies; if so it will run the action line, else it will run the prerequisite to make it up to date based on its dependencies. (1st variable on the rule line (ie "hello" in ex below) is the name of the target that can be specified on unix cmd line as "make hello")
hello: ; => dependency line (or pre-requisite line): It's blank here as we don't have any dependency. We can put a ; at end of line if we want to put next cmd on this line itself, otherwise it's not needed.
       @echo Hello => action line: running "make hello" outputs "Hello" on screen. @ prevents make from announcing the command it's going to do. So, @ prevents the cmd "echo Hello" from getting printed on screen. Since echo is already printing "Hello" on screen, we do not want 2 lines to be printed on screen (i.e "echo Hello" followed by "Hello"). That's why we put a @
diskfree: ;
          df -h => running "make diskfree" runs "df -h" which outputs diskfile usage on screen. Since @ is not used, it outputs the cmd "df -h" on screen

#create a Makefile and copy the above 2 lines in it. Then run 2 cmds below on cmd line in shell. If we don't tell make a target, it will simply do the first rule:
make => will do target hello, resulting in "Hello" on screen.
make diskfree hello => it will do these targets in order, first diskfree, then hello.

Make options:

There are various options that can be used when running make. They can be found on gnu link above. Some of the imp ones are below:

#above we did not specify which makefile to use.  It uses Makefile in current dir. To be specific, we say:
make -f makefile.cvc => runs make on this makefile.cvc. no target specified, so does the first rule, which is "cvc" which makes binary for cvc. To run clean, do:
make -f makefile.cvc clean => runs rule "clean" for this Makefile.

make -C dir1/dr2 all => -C specifies the dir where Makefile is. all is a convention. "all" rule is defined which runs sub targets to build the entire project. Since usually we want to run "all", we put it as first target, so that just running "make" will run "make all".

make hello -n => -n option shows what all steps will be run for target "hello" without actually running the steps. This helps us in understanding the sequence of steps that will be run when analyzing a Makefile. This is very useful in debug and used a lot. Always run any target with "-n" option to check what it's going to do, and then run it without "-n" to run it.

pattern matching: % can be used to match multiple targets. '%' acts as a wildcard, matching any number of any characters within a word. Ex:

final_%_run: %.out ;

           @echo "Hell" => Now, when we run "make final_2ab_run", then this rule gets run, as target matches name in Makefile with % matching "2ab". It has a dep 2ab.out (since wildcard % is assigned 2ab). We cannot use % in action line, as it's not substituted with "2ab". If we run "make final_abc_run", then again the same target gets run, but now % is replaced by abc. So, dep is now abc.out. NOTE: when we use % notation, then "make" w/o any target will error out, as there's no default matching target.

Phony targets:

.PHONY: By default, Makefile targets are "file targets" - they are used to build files from other files. Make assumes its target is a file. i.e "hello: abc ;echo .." implies hello and abc are files (hello and abc files are generated via cmds in Makefile). It looks at timestamps of hello and abc files to decide what to do. However, many targets such as "clean", "all", "install" are not files, so if there is a file with the same name, make will start looking at the timestamp of such files to determine what to do. The .PHONY directive tells make which target names have nothing to do with potentially existing files of the same name. PHONY implies these targets are not real. A .PHONY target implies a target that is always out of date and always runs, ignoring any time stamps on file names. ex: .PHONY: setup => this will cause setup to be run irrespective of the state of file "setup" if any.

automatic variables: On top of var defined by user, we also have in built var:

  •  $@ in action line is substituted with the target name
    ex: hello: ;
               @echo printmsg $@ => prints "printmsg hello" on screen.
     
  • $< in action line substitutes it with name of 1st pre-requisite in dep line. To get names of all prereq with spaces in b/w them, use $^ or $+ ($^ removes duplicate prereq, while $+ retains all of them).
    ex: tiger.pdf: tiger.ps; ps2pdf $< => make tiger.pdf will run this cmd: ps2pdf tiger.ps

 



------
1. Example of Makefile and make:

ex: executable "sum" is to be generated from 2 C files (main.c,sum.c) and 1 h file sum.h, which is included in both c files.
run make => reads Makefile, creates dependency tree and takes necessary action.


#binary exe
sum: main.o sum.o => dependency line states that exe sum depends on main.o and sum.o
      cc -o sum main.o sum.o => cmd/action line states how pgm should be compiled if 1 or more .o files have changed.

#main dep
main.o: main.c sum.h => dep line for main.o. we can omit main.c, since built in rule for make says that .o file depends on corresponding .c file.
        cc -c main.c => cmd line stating how to generate main.o

#sum dep
sum.o: sum.c sum.h => similarly we can omit sum.c from here
       cc -c sum.c

#above 2 dep, main & sum can be combined into 1:
main.o sum.o: sum.h     => means both .o files depend on sum.h (dep on main.c and sum.c is implied automatically)
              cc -c $*.c => macro $*.c expands to main.c for main.o and sum.c for sum.o



2. Example of Makefile and make:

ex: executable for arm processor:

Makefile: /data/tmp/Makefile
run:  make TGT=hellow

#define variables for compiler, assembler, linker and elf
tool-rev        =       -4.0-821
CC              =       armcc $(tool-rev)

#compiler options
CCFLAGS         =       $(CPUTARGET) -I $(INCPATH) -c --data_reorder \
                        --diag_suppress=2874 \
                        --asm

#dependency rule to state
IKDEPS_MAIN     =       CMSIS/Core/CM0/core_cm0.h CMSIS/Core/CM0/core_cm0.c cm0ikmcu.h IKtests.h IKtests.c IKConfig.h debug_i2c.h sporsho_tb.h Makefile
IKDEPS          =       $(IKDEPS_MAIN) debugdriver
IKOBJS          =       boot.o system_cm0ikmcu.o retarget_cm0ikmcu.o IKtests.o debug_i2c.o sporsho_tb.o

# Performance options (adds more options to compiler flag)
CCFLAGS         +=      -O2 -Otime -Ono_autoinline -Ono_inline

#Rules to create dependency tree
#top level target depends on TGT.bin
$(TGT):         $(TGT).bin => TGT depends on TGT.bin
                @echo => cmd/action line: the @ prevents make from echoing the cmd itself (here the recipe just prints a blank line).
 
#expands to fromelf -4.0-821 --bin -o kail_rtsc.bin kail_rtsc.elf
$(TGT).bin:     $(TGT).elf
                $(FROMELF) --bin -o $@ $<

#expands to armlink -4.0-821 --map --ro-base=0x0 --rw-base=0x20000020 --symbols --first='boot.o(vectors)' --datacompressor=off --info=inline -o kail_rtsc.elf kail_rtsc.o boot.o system_cm0ikmcu.o retarget_cm0ikmcu.o IKtests.o debug_i2c.o sporsho_tb.o
$(TGT).elf:     $(TGT).o $(IKOBJS)
                $(LD) $(LDFLAGS) -o $@ $(TGT).o $(IKOBJS)

#expands to armcc -4.0-821 --cpu=Cortex-M0 -I ./CMSIS/Core/CM0 -c --data_reorder --diag_suppress=2874 --asm  -O2 -Otime -Ono_autoinline -Ono_inline -o kail_rtsc.o kail_rtsc.c
$(TGT).o:       $(TGT).c $(IKDEPS)
                $(CC) $(CCFLAGS) -o $@ $< => $@ = TGT.o, $< = TGT.c

#all specifies what to run when no target is specified, i.e when we run just "make"
all:    debug

#similarly we specify rules for debug_i2c.o, sporsho_tb.o, boot.o, boot_evm.o, sporsho1_lib.o, retarget_cm0ikmcu.o, system_cm0ikmcu.o, system_cm0ikmcu_evm.o, IKtests.o.
#ex for IKtests. expands to armcc -4.0-821 --cpu=Cortex-M0 -I ./CMSIS/Core/CM0 -c --data_reorder --diag_suppress=2874 --asm -O2 -Otime -Ono_autoinline -Ono_inline -o IKtests.o IKtests.c
IKtests.o:      IKtests.c $(IKDEPS_MAIN)
                $(CC) $(CCFLAGS) -o $@ $<

#clean specifies that rm everything when running "make clean"
clean:
                rm -f *.bin *.elf *.o *.s *~
-----------------------------------

Advanced section:

1. Function for string substitution:

A. subst => simple substitution => $(subst from,to,text) => Performs a textual replacement on the text text: each occurrence of from is replaced by to. The result is substituted for the function call.

ex: Var1 = $(subst ee,EE,feet on the street) => new string "fEEt on the strEEt" assigned to Var1

B. patsubst => pattern substitution => $(patsubst pattern,replacement,text) => Finds whitespace-separated words in text that match pattern and replaces them with replacement. Here pattern may contain a ‘%’ which acts as a wildcard, matching any number of any characters within a word. If replacement also contains a ‘%’, the ‘%’ is replaced by the text that matched the ‘%’ in pattern. 

ex: Var2 = $(patsubst %.c,%.o,x.c.c bar.c) => .c replaced with .o, everything else is copied exactly as it's % on both pattern and replacement. so, final value assigned to Var2 is "x.c.o bar.o"
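The patsubst behaviour can be mimicked in a few lines of Python, which makes the '%' stem handling explicit (an illustrative re-implementation, not make's actual code):

```python
def patsubst(pattern, replacement, text):
    # Minimal sketch of make's $(patsubst pattern,replacement,text):
    # '%' in pattern matches any stem within a whitespace-separated word,
    # and '%' in replacement is substituted with that matched stem.
    out = []
    for w in text.split():
        if '%' in pattern:
            pre, post = pattern.split('%', 1)
            if w.startswith(pre) and w.endswith(post) and len(w) >= len(pre) + len(post):
                stem = w[len(pre):len(w) - len(post)] if post else w[len(pre):]
                out.append(replacement.replace('%', stem))
                continue
        elif w == pattern:
            # without '%', only exact whole-word matches are replaced
            out.append(replacement)
            continue
        out.append(w)  # non-matching words are copied through unchanged
    return ' '.join(out)

print(patsubst("%.c", "%.o", "x.c.c bar.c"))  # x.c.o bar.o
```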

2. substitution reference: A substitution reference substitutes the value of a variable with alterations that you specify. It's identical to patsubst function above, so there's really no need for this, but it's provided for compatibility with some implementations of make.

Form => $(var:a=b) or ${var:a=b} => () or {} mean the same thing. Takes the value of the variable var, replaces every "a" at the end of a word with "b" in that value, and substitutes the resulting string. Only those "a" that are at the end of a word (i.e followed by whitespace) are replaced. All other "a" remain unaltered. This form is same as patsubst => $(patsubst %a,%b,$(var))

foo := a.o b.o c.o
bar := $(foo:.o=.c) => It says to look at all words of var "foo", and replace the trailing ".o" of every word with ".c". Then assign this modified string to var "bar". So, bar gets set to "a.c b.c c.c". Here wild card matching of patsubst is not used.
bar := $(foo:%.o=%.c) => Here wildcard char % is used for matching. So, bar gets set to "a.c b.c c.c". Here wild card matching of patsubst is used.

ex: following is in a makefile to create different target based on what TOP_MOD is being set to. TOP_MOD is assigned value from cmd line or from some other file: TOP_MOD := abc

target1_me: ${TOP_MOD:%=%.target1_me} ; => whenever we run target1_me from the make cmdline, it calls this target. It has a dependency specified within ${..}. Since this is a substitution reference, the whole ${..} gets expanded to abc.target1_me. So, make looks for target abc.target1_me.
%.target1_me: ; echo "Hell"; => make finds this target as % expands to abc, so it starts running this target with whatever action it's supposed to do. In effect, we redirected flow to the "abc.target1_me" target. This is helpful in cases where the same target needs to be run multiple times, but with different options.
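A minimal runnable version of this redirection trick (assumes GNU make; the echoed text and /tmp path are made up):

```shell
# Throwaway makefile: target1_me's dependency expands to abc.target1_me,
# which is then matched (and run) by the %.target1_me pattern rule.
cat > /tmp/redirect_demo.mk <<'EOF'
TOP_MOD := abc
target1_me: $(TOP_MOD:%=%.target1_me) ;
%.target1_me: ; @echo "running $@"
EOF
make -s -f /tmp/redirect_demo.mk target1_me
# prints: running abc.target1_me
```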


Difference in DC (Design Compiler) vs EDI (Encounter Digital Implementation):
-----------------------
1. Many of the cmds work on both DC and EDI. The biggest difference is in the way they show o/p. In all the cmds below, if we use the tcl "set" cmd to set a variable to the o/p of any of these cmds, then in DC it contains the actual object, while in EDI it contains a pointer and not the actual object. We have to do a query_objects in EDI to print the object. DC prints the object by using list.
2. Unix cmds don't work directly in EDI, while they do in DC. So, for EDI, we need to have the "exec" tcl cmd before the linux cmd, so that it's interpreted by the tcl interpreter within EDI.
3. Many newer tcl cmds like "lassign", etc don't work in EDI.
4. NOTE: a script written for EDI will always work for DC as it's written as pure tcl cmds.

Design compiler:
---------------------
Register inference: (https://solvnet.synopsys.com/dow_retrieve/F-2011.06/dcrmo/dcrmo_8.html?otSearchResultSrc=advSearch&otSearchResultNumber=2&otPageNum=1#CIHHGGGG)
-------
On doing elaborate on an RTL, HDL Compiler (Presto HDLC for DC) reads in a Verilog or VHDL RTL description of the design, and translates the design into a technology-independent representation (GTECH). During this, all "always @" stmt are looked at for each module. Mem devices are inferred for flops/latches, and "case" stmt are analyzed. After that, the top level module is linked, all multiple instances are uniquified (so that each instance has a unique module defn), and clk-gating/scan and other user supplied directives are looked at. Then pass 1 mapping and then opt are done. Unused regs, unused ports and unused modules are removed.

#logic level opt: works on the opt GTECH netlist. Consists of 2 processes:
A. Structuring: subfunctions that can be factored out are optimized. Also, intermediate logic structure and variables are added to the design.
B. Flattening: comb logic paths are converted to 2 level SOP, and all intermediate logic structure and variables are removed.
This generic netlist has the following cells:
1. SEQGEN cells for all flops/latches (i/p = clear, preset, clocked_on, data_in, enable, synch_clear, synch_preset, synch_toggle, synch_enable; o/p = next_state, Q)
2A. ADD_UNS_OP for all unsigned adder/counter comb logic (i/p = A, B; o/p = Z). These can be any-bit adders/counters. DC breaks large bit adders/counters into small bit ones (i.e an 8 bit counter may be broken into 2 counters: 6 bit and 2 bit). Note that flops are still implemented as SEQGEN. Only the combinatorial logic of this counter/adder (i.e a+b or a+1) is impl as ADD_UNS_OP, the o/p of which feeds into flops.
2B. MULT_UNS_OP for unsigned multiplier/adder?
2C. EQ_UNS_OP for checking unsigned equality b/w two sets of bits, GEQ_UNS_OP for greater than or equal (i/p = A, B; o/p = Z). i/p may be any no. of bits but o/p is 1 bit.
3. SELECT_OP for Muxes (i/p = data1, data2, ..., datax, control1, control2, ..., controlx; o/p = Z). May be any no. of i/p, o/p.
4. GTECH_NOT(A,Z), GTECH_BUF, GTECH_TBUF, GTECH_AND2/3/4/5/8(A,B,C,..,Z), GTECH_NAND2/3/4/5/8, GTECH_OR2/3/4/5/8, GTECH_NOR2/3/4/5/8, GTECH_XOR2/3/4, GTECH_XNOR2/3/4, GTECH_MUX*, GTECH_OAI/AOI/OA/AO, GTECH_ADD_AB (half adder: A,B,S,COUT), GTECH_ADD_ABC (full adder: A,B,C,S,COUT), GTECH_FD* (D FF with clr/set/scan), GTECH_FJK* (JK FF with clr/set/scan), GTECH_LD* (D latch with clr), GTECH_LSR0 (SR latch), GTECH_ISO* (isolation cells), GTECH_ONE/ZERO, for various cells. DesignWare IP (from Synopsys) use these cells in their implementation. NOTE: in a DC gtech netlist, we commonly see GTECH gates as NOT, BUF, AND, OR, etc. Flops, latches, adders, mux, etc are rep as cells shown in bullets 1-4 above.
5. All directly instantiated lib components in RTL.
6. If we have a DesignWare license, then we also see DesignWare elements in the netlist. All DesignWare are rep as DW*. For ex: DW adder is DW01_add (n bit width, where n can be passed as defparam or #). Maybe the *_UNS_OP above are DesignWare elements.
#gate level opt: works on the generic netlist created by logic level opt to produce a technology-specific netlist. Consists of 4 processes:
A. Mapping: maps gates from the tech lib onto the gtech netlist. Tries to meet timing/area goal.
B. Delay opt: fixes delay violations introduced during mapping. Does not fix design rule or opt rule violations.
C. Design rule fixing: fixes design rules by inserting buffers or resizing cells. If necessary, it can violate opt rules.
D. Opt rule fixing: fixes opt rules, once the above 3 phases are completed. However, it won't fix these if it introduces delay or design rule violations.
-------
In GTECH, both registers and latches are represented by a SEQGEN cell, which is a technology-independent model of a sequential element as shown in Figure 8-1. SEQGEN cells have all the possible control and data pins that can be present on a sequential element. FlipFlop or latch is inferred based on which pins are actually present in the SEQGEN cell. Register is a latch or FF.
- D-Latch is inferred when the resulting value of the o/p is not specified under all conditions (as in incompletely specified IF or CASE stmt). SR latches and master-slave latches can also be inferred.
- D-FF is inferred whenever the sensitivity list of the always block or process includes an edge expression (rising/falling edge of signal). JK FF and Toggle FF can also be inferred.
#"_reg" is added to the name of the reg from which the ff/latch is inferred (i.e count <= .. implies count_reg as name of the flop/latch)
o/p: Q and QN (for both flop and latch)
i/p:
1. Flop: clear(async reset), preset(async preset), next_state(sync data Din), clocked_on(clk), data_in(1'b0), enable(1'b0 or en), synch_clear(1'b0 or sync reset), synch_preset(1'b0 or sync preset), synch_toggle(1'b0 or sync toggle), synch_enable(1'b1)
2. Latch: clear(async reset), preset(async preset), next_state(1'b0), clocked_on(1'b0), data_in(async data Din), enable(clk), synch_clear(1'b0), synch_preset(1'b0), synch_toggle(1'b0), synch_enable(1'b0)

Ex: Flop in RTL:
always @(posedge clkosc or negedge nreset)
  if (~nreset) Out1 <= 'b0;
  else         Out1 <= Din1;

Flop replaced with SEQGEN in DC netlist: clear is tied to net 0, which is N35. preset=0, since no async preset. data_in=0 since it's not a latch. sync_clear/sync_preset/sync_toggle also 0. synch_enable=1 means it's a flop, so enable, if used, is sync with clock. enable=0 as no enable in this logic.
\**SEQGEN** Out1_reg ( .clear(N35), .preset(1'b0), .next_state(Din1), .clocked_on(clkosc), .data_in(1'b0), .enable(1'b0), .Q(Out1), .synch_clear(1'b0), .synch_preset(1'b0), .synch_toggle(1'b0), .synch_enable(1'b1) );

Ex: Latch in RTL:
always @(*)
  if (~nreset)  Out1 <= 'b0;
  else if (clk) Out1 <= Din1;

Latch replaced with SEQGEN in DC netlist: all sync_* signals set to 0 since it's a latch. synch_enable=0 as enable is not sync with clk in a latch. enable=clk since it's a latch.
\**SEQGEN** Out1_reg ( .clear(N139), .preset(1'b0), .next_state(1'b0), .clocked_on(1'b0), .data_in(Din1), .enable(clk), .Q(Out1), .synch_clear(1'b0), .synch_preset(1'b0), .synch_toggle(1'b0), .synch_enable(1'b0) );

NOTE: the flop has both enable and clk ports separate. synch_enable is set to 1 for a flop (and 0 for a latch). That means lib cells can have enable and clk integrated into the flop. If we have RTL as shown below, it will generate a warning if there is no flop with integrated enable in the lib.
ex: always @(posedge clk) if (en) Y <= A; //This is a flop with enable signal.
Warning by DC: The register 'Y_reg' may not be optimally implemented because of a lack of compatible components with correct clock/enable phase. (OPT-1205) => this will be implemented with a Mux and a flop as there's no "integrated enable flop" in the library.
#Set the following variable in HDL Compiler to generate additional information on inferred registers:
set hdlin_report_inferred_modules verbose

Example 8-1: Inference Report for D FF with sync preset control (for a latch, Type changes to Latch)
==========================================================================
| Register Name | Type      | Width | Bus | MB | AR | AS | SR | SS | ST |
==========================================================================
| Q_reg         | Flip-flop | 1     | N   | N  | N  | N  | N  | Y  | N  |
==========================================================================
Sequential Cell (Q_reg)
  Cell Type: Flip-Flop
  Width: 1
  Bus: N (since just 1 bit)
  Multibit Attribute: N (if it is a multi bit ff, i.e each Q_reg[x] is a multi bit reg, then this ff would get mapped to a cell in the .lib which has an ff_bank group)
  Clock: CLK (shows name of clk. For a -ve edge flop, CLK' is shown as clock)
  Async Clear (AR): 0
  Async Set (AS): 0
  Async Load: 0
  Sync Clear (SR): 0
  Sync Set (SS): SET (shows name of Sync Set signal)
  Sync Toggle (ST): 0
  Sync Load: 1

#Flops can have sync reset (there's no concept of sync reset for latches). Design Compiler does not infer synchronous resets for flops by default. It will see the sync reset signal as combo logic, and build combo logic (an AND gate at the D i/p of the flop) to implement it. To indicate to the tool that it should use an existing flop (with sync reset), use the sync_set_reset Synopsys compiler directive in Verilog/VHDL source files. HDL Compiler then connects these signals to the synch_clear and synch_preset pins on the SEQGEN in order to communicate to the mapper that these are the synchronous control signals and they should be kept as close to the register as possible. If the library has a reg with sync set/reset, then these are mapped, else the tool adds extra logic on the D i/p pin (adds AND gate) to mimic this behaviour.
ex: //synopsys sync_set_reset "SET" => this is put in RTL inside the module for the DFF.
This says that pin SET is a sync set pin, and a SEQGEN cell with clr/set should be used.

#Latches and Flops can have async reset. DC is able to infer async reset for flops (by choosing a SEQGEN cell with async clear and preset connected appropriately), but for latches, it's not able to do it (it chooses a SEQGEN cell with async clear/preset tied to 0). This is because it sees the clear/preset signal as any other combo signal, and builds combo logic to support it. DC maps the SEQGEN cell (with clr/preset tied to 0) to a normal latch (with no clr/set) in the library, and then adds extra logic to implement the async set/reset. It actually adds an AND gate on D with the other pin connected to clr/set, and an inverter on the clr/set pin followed by an OR gate (with the other pin of the OR gate tied to clk). So, basically we lose the advantage of having an async latch in the .lib. To indicate to the tool that it should use an existing latch (with async reset), use the async_set_reset Synopsys compiler directive in Verilog/VHDL source files.
ex: //synopsys async_set_reset "SET" => this says pin SET is an async set/reset pin, and a SEQGEN cell with clr/set should be used.

#infer_multi_bit pragma => maps registers, multiplexers and 3-state drivers to multibit library cells.

#stats for case stmt: shows full/parallel for case stmt. "auto" means it's full/parallel.
A. full case: all possible branches of the case stmt are specified, otherwise a latch is synthesized. Non-full cases happen for state machines when the number of states is not a power of 2. In such cases, unused states are opt as don't care.
B. parallel case: only one branch of the case stmt is active at a time (i.e case items do not overlap). Overlap may happen when case stmt have "x" in the selection, or multiple select signals are active at the same time (case (1'b1) sel_a: out=1; sel_b: out=0;). If more than 1 branch can be active, then priority logic is built (sel_a given priority over sel_b), else a simple mux is synthesized. RTL sim may differ from gate sim, for a non-parallel case.
#The report_design command lists the current default register type specifications (if we used the "set_register_type" directive to set flipflop/latch to something from the library).
dc_shell> report_design
...
Flip-Flop Types: Default: FFX, FFXHP, FFXLP

#MUX_OPs: listed in report_design. MUX_OPs are multiplexers with built-in decoders. Faster than SELECT_OPs, as SELECT_OPs have the decoding logic outside.
ex:
reg [7:0] flipper_ram[255:0]; => 8 bit array of ram from 0 to 255
assign p1_rd_data_out = flipper_ram[p1_addr_in]; => rd 8 bits out from addr of ram. equiv to rd_data[7:0] = ram[addr[7:0]].
This gives the following statistics for MUX_OPs generated from the previous stmt. (MUX_OPs are used to implement indexing into a data variable, using a variable address)
===========================================================
| block name/line | Inputs | Outputs | # sel inputs | MB |
===========================================================
| flipper_ram/32  | 256    | 8       | 8            | N  |
===========================================================
=> 8 bit o/p (rd_data), 8 bit select (addr[7:0]), 256 i/p (i/p refers to the distinct i/p terms that the mux is going to choose from, so here there are 256 terms to choose from; the no. of bits for each term is already indicated in o/p (8 bit o/p))

#list_designs: lists the names of the designs loaded in memory; all modules are listed here.
#list_designs -show_file: shows the path of all the designs (*.db in main dir)

--------------------------
Optimization priority in DC
--------------------------
Uses cost types to optimize the design. Cost types are design rule cost and optimization cost. By default, highest priority goes to design rule cost (top one) and then priority goes down as we move to bottom ones.
1. design rule cost => constraints are DRC (max_fanout, max_trans, max_cap, connection class, multiple port nets, cell degradation)
2. opt cost:
A. delay cost => constraints are clk period, max_delay, min_delay
B. dynamic power cost => constraints are max dynamic power
C. leakage power cost => constraints are max lkg power
D. area cost => constraints are max area
------------------------
#terminology within Synopsys: https://solvnet.synopsys.com/dow_retrieve/F-2011.06/dcug/dcug_5.html
#designs => ckt desc using verilog HDL or VHDL. Can be at logic level or gate level. Can be flat designs or hier designs. A design consists of instances (or cells), nets (connect ports to pins and pins to pins), ports (i/o of design) and pins (i/o of cells within a design). It can contain subdesigns and library cells. A reference is a library component or design that can be used as an element in building a larger circuit. A design can contain multiple occurrences of a reference; each occurrence is an instance. The active design (the design being worked on) is called the current design. Most commands are specific to the current design.
#to list the names of the designs loaded in memory:
dc_shell> list_designs
a2d_ctrl digtop (*) spi etc => * shows that digtop is the current design
dc_shell> list_designs -show_file => shows the memory file name corresponding to each design name
/db/Hawkeye/design1p0/HDL/Synthesis/digtop/digtop.db digtop (*)
/db/Hawkeye/design1p0/HDL/Synthesis/digtop/clk_rst_gen.db clk_rst_gen
#The create_design command creates a new design.
dc_shell> create_design my_design => creates a new design but contains no design objects. Use the appropriate create commands (such as create_clock, create_cell, or create_port) to add design objects to the new design.

Synopsys/Standard design constraints (SDC)

SDC is a subset of the design constraint commands already supported by many CAD tools. SDC was agreed on as a standard, since diff tool vendors had their own synthesis/timing constraint cmds, which made it difficult to port these constraints. Since most of the constraints for synthesis, timing, etc are standard (i.e define clock, port delays, false paths, etc), it just made sense to have standard constraints that would be supported by all vendors. Synopsys and Cadence Synthesis, timing, PnR, etc tools support these SDC cmds.
SDC versions are 1.2, 1.3 .. 2.0. In write_sdc (in both Synopsys and Cadence tools), we can specify the version of the sdc file to write (default is to use the latest version). SDC cmds started with having design constraint cmds only, but over time expanded to include cmds pertaining to reporting, collections, get objects, etc.

It's 2022, and still no one, including me, knows the full form of SDC !! Synopsys, which had these cmds initially in their synthesis/timing tools, allowed them to become a standard. Cadence and other companies grudgingly accepted it, and called it Standard Design Constraints (but don't mention the full form anywhere). But the Synopsys website still refers to these as "Synopsys Design Constraints".

Before we learn these cmds, let's go over few basics of design that these cmds work on.

Objects:

Objects are anything in the design like ports, cells, nets, pins, etc. Valid objects are design, port, cell, pin, net, lib, lib_cell, lib_pin, and clock. Each of these objects may have multiple attributes. As an ex, each gate may have 100's of attributes as gate_name, size, pins, delay_arcs, lib_name, etc. These objects are further put into several classes as ports, cells, nets, pins, clocks, etc. Most commands operate on these objects.

Collections:

Synopsys applications build an internal database of objects and the attributes applied to them. Cadence applications also build a similar internal database, but the internal representation may be different, and the process to access them may be different. So arguments of some of the sdc commands may differ across vendors, even though they may support the basic sdc cmd. The same sdc file from Synopsys may not be usable directly in Cadence tools, as many of these collection cmds (cmds that work on collections of objects) may have some part of the code (cmds that were used to make a collection) that may not be recognized by Cadence tools as valid. The situation is improving, but this is something that needs to be kept in mind. Always read the SDC manual of a vendor to find out what sdc cmds and syntax it supports.

Definition: A collection is a group of objects exported to the Tcl user interface. Collections are a tcl extension provided by EDA vendors (Synopsys/Cadence) to support lists of objects in their Tcl API. Most of the design cmds work on these collections of objects. In the Tcl language, we have 2 composite data types, "list" and "array", that allow us to put multiple elements into a single group. Collections can be thought of as similar to a Tcl list, though they are not interchangeable, as the internals of the 2 are different. However, many cmds take both list and collection as i/p, not differentiating between the 2. This is done for user convenience. However, internally the list is converted to a collection, and the o/p of the cmd is always given out as a collection. Visually collections look like lists (when displayed on the screen), but they are NOT lists.

A set of commands to create and manipulate collections is provided as an integral part of the user interface. The collection commands encompass two categories: those that create collections of objects for use by another command, and those that query objects for viewing. These two types of collection cmds are:

1. create/manipulate objects: add_to_collection, append_to_collection, remove_from_collection, sizeof_collection, foreach_in_collection, sort_collection, compare_collections, copy_collection, filter_collection, etc are a few common collection cmds. These cmds work on collections of objects to manipulate that object list and create a new collection of objects. As such, these cmds may be used as i/p to other cmds that expect a collection of objects as their i/p. Collections return a pointer to that collection (i.e 0x78 is what you see on the screen when creating a collection), which is used by other cmds. In all the collection cmds below, the collections provided to the cmd themselves remain unchanged: a new collection is created which can either be passed to other cmds, or may be assigned to a var, which becomes the new collection.

  1. create/remove collection: Collections can be created by starting with an empty collection, and using "add_to_collection" or "append_to_collection" (both being similar, although append is more efficient in some situations as per the PT manual). Adding an implicit list of only strings or heterogeneous collections to the empty collection generates an error message, because no homogeneous collections (collection of same class, i.e either all ports, or all cells, etc) are present in the object_spec list. As long as one homogeneous collection is present in the object_spec list, the command succeeds, even though a warning message is generated. If the base collection to which we are adding the new collection is not empty, then heterogeneous objects (i.e ports, cells) from the second collection are added to the base collection only if the base collection also has heterogeneous objects. So, the rules get complicated, and we'd have to check each time what actually got added/deleted. Instead of relying on the add or append cmds, it's easier to just make a collection using get_ports, get_cells, etc, which implicitly make a collection of items.
    1. add_to_collection: Creates a new collection by concatenating objects from the base collection and the second collection. base_collection needs to be a collection, while second collection may be collection or object list. The base collection and second collection remain unmodified, i.e these collections are appended and new collection is made which can either be passed to other cmds, or may be assigned to a var, which becomes the new collection. This cmd returns AUB (i.e union of A and B)
      • Syntax: add_to_collection <base_coll> <second_coll> => -unique option removes the duplicate objects from the new collection. 
        • ex in PT: pt_shell> add_to_collection "" [list port1 port2] => This adds design objects "port1" and "port2" to the empty collection and creates a new collection which has no name, and if not passed to a cmd, does nothing. We can create a new coll, by giving it a name as => set var1 [add_to_collection "" [list port1 port2]]
        • pt_shell> set port_coll [list port1 port2] => Here new collection port_coll  is formed which has the 2 ports in that coll. These ports may or may not be valid. Here we didn't have to explicitly create a collection as the o/p of [...] is converted to a  collection, the pointer to which is set to port_coll. echo $port_coll returns a pointer as 0x87 (in Cadence tools) or _sel3555 (in Synopsys tools). However on screen it shows {"port1", "port2"}. We could also use get_ports cmd whose o/p is a port list collection. That's a preferred way as that guarantees correct collection being provided as i/p to collection list. i.e
        • pt_shell> set port_coll [get_ports [list port_a port_b[*]]] => Here the get_ports o/p is a coll of => {"port_a", "port_b[0]", "port_b[1]"}. We set this coll to port_coll.
        • pt_shell> set port_coll [list $port1 $cell2] => here port1, cell2 need to be valid object names or collections. This may give an error, as variables $port1 and $cell2 might be pointers to collections, lists, etc. One way to guarantee they are valid is to have these vars come from the o/p of get_ports, get_cells or similar cmds which return a collection as o/p. If not, we get an Error in PT => At least one collection required for argument 'object_spec' to add_to_collection when the 'collection' argument is empty (SEL-014)
    2. append_to_collection: Since add_to_collection doesn't modify the original collection, this cmd allows you to append the original coll, which is useful in many cases. This command can be much more efficient than the add_to_collection command if you are building up a collection in a loop.
      • Syntax: append_to_collection var_name <obj_or_coll> => -unique option removes the duplicate objects from the new collection. NOTE: var may or may not be defined; it may be an existing coll, or an empty coll. If the var does exist and it does not contain a collection, it is an error. We do NOT have $ in front of the var name, as we are modifying or defining this var (we are not accessing the value of the var). This var becomes a coll once a coll is appended to it.
        • ex: append_to_collection my_ports [get_ports in*] => Here my_ports is a var, to which [get_ports in*] coll get added.
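A minimal pt_shell-style sketch of the loop usage described above (the port patterns in*/out*/clk* are made up; this needs a tool shell, not plain tclsh):

```tcl
# Grow a collection inside a loop: append_to_collection modifies the var in
# place, so there's no per-iteration copy of the base collection.
set all_io ""
foreach pat {in* out* clk*} {
    append_to_collection all_io [get_ports -quiet $pat]  ;# no $ on all_io
}
echo [sizeof_collection $all_io]
```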
    3. remove_from_collection: To remove elements from a collection. Any element in the 2nd coll is removed from the base_coll. Similar to "add_to_collection", the base collection and second collection remain unmodified. A new collection is created, which can be assigned to a var. Here, unlike add_to_collection, the arguments can't contain a plain list of objects (add_to_collection allowed lists as an option).
      • Syntax: remove_from_collection <base_coll> <obj_coll> => o/p is A - (A∩B) since objects in <obj_coll> are removed from <base_coll>. -intersect option does the opposite => it removes obj in base coll that are not found in <obj_coll>. So, with this option, instead of providing o/p as A - (A∩B), it's A∩B (that's why it's called intersect). So, with remove and add cmd, we can find out all combo => AUB, A∩B, A - (A∩B) and B - (A∩B)
        • ex: remove_from_collection [concat [get_cells *] [get_ports *in*]] [get_cells I_*] => removes specified cells from coll of cells and ports
        • ex: set dports  [remove_from_collection [all_inputs] CLK] => all_inputs creates a collection of all i/p ports, from which CLK is removed. The remaining collection pointer is assigned to var dports.
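Putting the identities above together, a pt_shell-style sketch of all 4 set operations on two collections (the port patterns are made up; needs a tool shell, not plain tclsh):

```tcl
set A [get_ports in*]
set B [get_ports *clk*]
set union     [add_to_collection -unique $A $B]          ;# A U B
set intersect [remove_from_collection -intersect $A $B]  ;# A ∩ B
set a_only    [remove_from_collection $A $B]             ;# A - (A ∩ B)
set b_only    [remove_from_collection $B $A]             ;# B - (A ∩ B)
```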
  2.  compare_collections: To compare 2 coll if they have same elements or not, we use this cmd. 0 indicates success (i.e same elements), while any other number indicates failure (diff elements). This behaviour is consistent with "string compare" in tcl, which returns 0 on match.
    • Syntax: compare_collections <coll1> <coll2> => With -order_dependent option, the collections are considered to be different if the objects are ordered differently.
      • compare_collections $c1 [get_ports out*] => Here c1 coll and ports coll are compared, and if they have same contents (order doesn't matter), then 0 is returned.
  3. foreach_in_collection: To iterate over the objects in a collection, use the foreach_in_collection command. You cannot use the Tcl- supplied foreach iterator to iterate over the objects in a collection, because the foreach command requires a list, and a collection is not a list. The o/p of any cmd that returns a collection is actually a pointer to that collection. The arguments of the foreach_in_collection command are similar to those of foreach: an iterator variable, the collection over which to iterate, and the script to apply at each iteration.
    • Ex: set A [get_ports] => returns these {"port[0]", "IN2", "SPI_CLK0"} => Object names are displayed just for readability. A still holds the pointer value. echo $A => returns a pointer _sel3555
    • foreach_in_collection tmp $A {echo [get_object_name $tmp]} => returns these  3 ports = port[0] IN2 SPI_CLK0. echo $tmp will return pointer value _sel3555 3 times. This is because tmp is just assigned the pointer value
    • foreach_in_collection tmp _sel3555 {echo [get_object_name $tmp]} => we get same result as above, since A is just storing the pointer to collection
    • create_fillers -lib_cells $A => These collections ($A) work just fine in cmds which expect to get a collection pointer as argument. No need to iterate thru each element of collection. Many cmds accept both collection as well as list for argument. So, no issues with either one.
    • set_false_path -to [get_pins mod1/*] => Here get_pins returns a pointer which points to a collection with all pins of mod1. Since set_false_path takes objects in it's args, it takes this collection of pins as an argument, and sets false path to all the pins, i.e set_false_path -to {mod1/pinA, "mod1/pinB[0]", ...}. However, if we want, we can iterate thru each pin of "get_pins" cmd using foreach_in_collection loop.
      • ex: foreach_in_collection tmp [get_pins mod1/*] {set_false_path -to [get_object_name $tmp]} => Here each object in collection is assigned to var "tmp" one by one, and then false path set to each of them separately. i.e set_false_path -to mod1/pinA, set_false_path -to "mod1/pinB[0]", ... => see how set_false_path is done separately for each object of collection. So, collections can be used as one, or can be divided into individual elements.
    • get_ports _sel3555 => since _sel3555 is a pointer to collection, cmd get_ports gets collection of ports from the collection pointed by _sel3555. That returns collection of ports similar to what "get_ports" returns by itself. This is just a convoluted way to show that it works.
  4. sizeof_collection: Very useful cmd to figure out the size of collection w/o iterating thru the whole list. To find the size of collection, one way is to use foreach_in_collection cmd above. We increment a counter inside the loop, and when that loop exits, the counter shows the number of elements in the collection. sizeof_collection provides an easier way to do that by using this single cmd instead of a loop. Sometimes collections will be empty, and in such cases, we want to know beforehand. Otherwise our collection cmds will error out since all these cmds expect valid collection pointer. In such cases, sizeof_collection is useful too.
    • ex: if { [sizeof_collection [get_ports -quiet SPI*]] != 0} { foreach_in_collection tmp [get_ports -quiet SPI*] { do operation } } else { echo "collection empty" } => If we directly used foreach_in_collection on an empty list, then the tool would report an error saying the collection is empty. We avoid that by using if else.
  5. filter_collection: Most used cmd to filter objects from any collection based on specified filtering criteria. It allows objects to be filtered out before they are ever included in the collection. An alternative -filter option for many cmds is also provided for many apps, see details in later section. 
    • ex: filter_collection  [get_cells *] "is_hierarchical == false && ref_name=~*AN2*" => get_cells creates a collection of all cells in the design, which is then filtered down to leaf (non-hierarchical) cells whose reference cells have pattern AN2 in them. So, it finally gets all AND2 leaf cells. NOTE: double quotes are only at the start and end of the filter expr (not around each sub-expr)
    • ex: filter_collection $coll -regexp "full_name =~ ^${softip}/eco_\[a-z\]+_icd/.* or full_name =~ ^${myip}/.*" => here regex is used, so we use .* instead of *. Also, keywords "and", "or" may be used instead of &&, ||, etc. $coll is a collection generated from the o/p of some prior cmd.

2. query objects:

We can do "echo" of a list, and it will print the list, but with a collection, a simple "echo" won't return the list. When we do "echo" on a collection, we only get a pointer to the collection. However, on screen, we do see o/p similar to echo of a list (where it's listing names of objects in the collection), whenever we run any cmd that outputs a collection. This is done by vendor tools just for convenience. CAD tools by default call an implicit "query_objects" cmd, whenever any cmd that outputs a collection is run. A default limit of 100 is set as the max number of objects that can be displayed (controlled by a variable "collection_result_display_limit" whose default value of 100 can be changed to anything we want). Though this works for most viewing purposes, we can't use this printed o/p within a script, as the cmd itself just returns a pointer to the collection and not a list of objects in the collection.

In order to see what's stored in a collection, we can also use the built-in collection proc "query_objects". "query_objects" takes as i/p either a collection or a pattern. By default, query_objects displays names of objects (-verbose option displays the class of each object as cell, net, etc too). Again, this is also for display purposes only, and its o/p can't be used in scripts, as it always returns an empty string. For getting names of objects to be used in a script, we have to use a function like "get_object_name" on the collection. There are many other cmds for getting other attributes of each object in the collection.

  • ex: query_objects [get_cells o*] => ["or1", "or2"]. get_cells returns a collection which is then passed to query_objects which gets the names and outputs it as a list. Here query_objects is passed a collection as it's i/p. Here o/p is in default legacy format
  • ex: query_objects -class cell U* => [U1 U2]. When the i/p to query_objects is a pattern, we have to specify a class as cell, net, etc (classes are application specific). Here o/p is in tcl format (i.e tcl list with no commas etc), which is possible by setting this app var => set_app_var query_objects_format Tcl

All below cmds expect collection, and so can only be passed a pointer to collection. Names or patterns as options will give an error. This is diff than query_objects cmd above which can take patterns as an i/p too.

  1. get_object_name => convenient way to get the full_name attribute of objects in a collection. When multiple objects are passed as i/p, the o/p is returned as a list. Cadence supports this as this cmd was added in SDC for compatibility while reading in SDC files with non-SDC constraints. However, there it just returns the specified i/p args as provided.
    1. ex: pt_shell> get_object_name [get_cells i*] => ["inx1", "ind2"]. get_cells returns a collection which is then passed to get_object_name.
    2. ex: get_object_name "in1/cell2" => errors out since "in1/cell2" is not a collection pointer. To make it work, do: get_object_name [get_cells in1/cell2]
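A common pt_shell-style idiom built on get_object_name: convert a collection into a plain Tcl list of names, so normal Tcl cmds (lsort, lsearch, foreach, etc) can then be used on it (the cell pattern is made up; needs a tool shell, not plain tclsh):

```tcl
set names {}
foreach_in_collection c [get_cells i*] {
    lappend names [get_object_name $c]  ;# each name appended to a real Tcl list
}
echo [lsort $names]
```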

-------------