Why is BTET transaction processing overhead in Teradata? Answer: - BTET makes all the queries running between BT and ET as single transaction . If any of query fails then the entire set of queries will not be committed. - BTET also has an overhead with locking , since it holds locks on tables till the ET is occured or all the queries have executed successfully - DDL statements cannot be used everywhere in BTET processing , but these statements can be given towards the end of BTET transaction. - Using large number of BTET caused transient Journal to grow and has tendancy for System Restarts ______________________________________________________________________________________ What are the options that are not available for global temporary tables ? GTT definitions are created in Data dictionary. These table provide separate instance to each user who refers to them . The following options are not available for global temporary tables: - Any kind of constraints like check/referential cannot be applied on table - Identity columns since data in GTT are materialized only for session - Permanent Journaling cannot be done as data in tables are instances only to user in that session - PPI cannot be applied as data does not get stored in PERM , only TEMP space is utilized here. _______________________________________________________________________________________ What are the options not available for volatile tables in teradata ? The following options are not available for volatile tables because table definition is not stored in data dictionary - Default values for columns - Title clause for columns - Named Indexes for table - Compression on columns/table level since table data are spooled - Stats cannot be collected since data is materialized only for session - Identity columns as these again would need entry in IDcol tables - PPI cannot be done on tables - Any kind of constraints like check/referential cannot be applied on table _____________________________________________________________________________ What are permanent journals in teradata? - Journals are used to capture information about table in Teradata. In case of Permanent journals they capture details of Journal enabled tables in teradata with all the pre transaction and post transaction details . - Journal tables are assigned PERM space and they reside in same database as of parent or they can reside on different database. - They are mainly used for protection of data and sometimes also for disaster recovery ( fallback is better in this case ) - Permanent journal tables can be enabled or disabled by running alter database <databasename> 'no journal' /' journal = <databasename.jrnltbl>' - Arcmain utility provides the feature of backing up Journal tables - We can find details about all journal tables present in teradata database using DBC.JOURNALS table. Posted by Vinay SHet at 12:11 AM 1 comment: Email ThisBlogThis!Share to TwitterShare to Facebook _______________________________________________________________________________1 Query to find skew factor of a particular table ? SELECT TABLENAME ,SUM(CURRENTPERM) /1024/1024 AS CURRENTPERM, (100 - (AVG(CURRENTPERM)/MAX(CURRENTPERM)*100)) AS SKEWFACTOR FROM DBC.TABLESIZE WHERE DATABASENAME= <DATABASENAME> AND TABLENAME =<TABLENAME> GROUP BY 1; _______________________________________________________________________________________ What is the acceptable range for skew factor in a table? There is no particular range for skew factor. In case of production systems, it is suggested to keep skew factor between 5-10. There are various considerations for skew factor - Number of AMPS - Size of row in a table - number of records in a table - PI of a table - Frequent access of table (Performance consideration) - whether table getting loaded daily /monthly or how frequently data is being refreshed Posted by Vinay SHet at 8:50 AM No comments: Email ThisBlogThis!Share to TwitterShare to Facebook ____________________________________________________________________________________________________ What is multivalued compression in Teradata? Multivalued compression or just MVC is a compression technique applied on columns in Teradata . MVC has a unique feature of compressing up-to 255 distinct values per column in a given table. The advantage of compression are Reduced Storage cost by storing more of a logical data than physical data. Performance is greatly improves due to reduced retrieval of physical data for that column. Tables having compression always have an advantage since optimizer considers reduced I/O as one of the factors for generating EXPLAIN plan. ________________________________________________________________________________________________________ What are the ways by which we can use zero to replace a null value for a given column ? Answer - 1. By using Teradata SQL supported command as follows Select Col1, ZEROIFNULL(Col2) from Table_name; 2. By using ANSI SQL command as follows a. Coalesce Select Col1,COALESCE(Col2,0) from Table_name; b.Case operator select Case When Col2 IS NOT NULL Then Col2 Else 0 End from Table_name; It is always suggested to use ANSI standard while coding in Teradata , since any changes in Teradata version due to upgrade/patches installation will lead to - Time for regression testing - rework of code. _______________________________________________________________________________________________________ What is a join index ? What are benefits of using Join index? Answer: It is a index that is maintained in a system .It maintains rows joined on two or more tables. Join index is useful for queries where the index structure contains all the columns referenced by one or more joins, thereby allowing the index to cover all or part of the query Benefits if using join index is - to eliminate base table access - Aggregate processing is eliminated by creating aggregate join index - It reduces redistribution of data since data is materialized by JI. - Reduces complex join conditions by using covering queries ____________________________________________________________________________________________________________ Can a macro be called inside a macro? Answer: The main purpose of run a set of repeated sql queries. Macro supports only DML queries . Hence We cant call any-other macro or not even a procedure in a macro. One trick to have closest functionality of this is to copy all the sql queries from macro2 to macro1 and add parameters if it is necessary as shown below. replace macro1( val int) as ( sel * from employee_table where empid= :val ); replace macro2( dept_no int) as ( sel * from employee_table where deptno= :dept_no ); so, to call a macro2 inside a macro1..it is not possible. Hence follow this approach Replace macro1 (val int, dept_no int) as ( sel * from employee_table where empid= :val; sel * from employee_table where deptno= :dept_no ; ); ______________________________________________________________________________________________________________________________ How do you find the list of employees named "john" in an employee table without using Like operator?? This question seems tricky.. but yes there is another way by which we can find names/patters without using like operator. By using "BETWEEN" , we can find the lsit of employees named john... sel * from employee where name betwee 'J' and 'K'; But at times usage of between is tricky, if there are other employees starting with J, those employees will also be listed by this query. Posted by Vinay SHet at 10:05 AM 2 comments: Email ThisBlogThis!Share to TwitterShare to Facebook how do you list all the objects available in given database? 1.)select * from dbc.tables where databasename='<DATABASENAME>'; 2.) By running a normal help command on that database as follows. help database <DATABASENAME'; ______________________________________________________________________________________________________________________________ What are different types of Spaces available in Teradata ? There are 3 types of Spaces available in teradata ,they are 1. Perm space -This is disk space used for storing user data rows in any tables located on the database. -Both Users & databases can be given perm space. -This Space is not pre-allocated , it is used up when the data rows are stored on disk. 2.Spool Space -It is a temporary workspace which is used for processing Rows for given SQL statements. -Spool space is assigned only to users . - -Once the SQL processing is complete the spool is freed and given to some other query. -Unused Perm space is automatically available for Spool . 3. TEMP space -It is allocated to any databases/users where Global temporary tables are created and data is stored in them. -Unused perm space is available for TEMP space ________________________________________________________________________________________________________________________________ What is hash collision ? This occurs when there is same hash value generated for two different Primary Index Values. It is a rare occurance and Has been taken care in future versions of TD. Posted by Vinay SHet at 5:47 AM No comments: Email ThisBlogThis!Share to TwitterShare to Facebook _________________________________________________________________________________________________________________________________ What is RAID, What are the types of RAID? Redundant Array of Inexpensive Disks (RAID) is a type of protection available in Teradata. RAID provides Data protection at the disk Drive level. It ensures data is available even when the disk drive had failed. Th\ere are around 6 levels of RAID ( RAID0 to RAID5) . Teradata supports Two levels of RAID protection RAID 1 - Mirrored copy of data in other disk RAID 5 - Parity bit (XOR) based Data protection on each disk array. One of the major overhead's of RAID is Space consumption Posted by Vinay SHet at 4:59 AM No comments: Email ThisBlogThis!Share to TwitterShare to Facebook ______________________________________________________________________________________________________________________________________ How do you find the No of AMP 's in the teradata Database? 1.) You can do a SELECT HASHAMP()+1 to get the total number of Amps in the given teradata system. 2.) Details about amps can also be checked in Configuration management on teradata PMON tool. __________________________________________________________________________________________________________________________________________ How do you see a DDL for an existing table in Teradata? By using show table command as follows show table tablename ; This will display DDL structure of table along with following details Fallback Before/after journal set/multiset table type Index(PI,SI,PPI) details about column - datatype ,default,identity/primary -foreign key . ___________________________________________________________________________________________________________________________________________ How to find duplicates in a table? To find duplicates in the table , we can use group by function on those columns which are to be used and then listing them if their count is >1 . Following sample query can be used to find duplicates in table having 3 columns select col1, col2,col3, count(*) from table group by col1, col2, col3 having count (*) > 1 ; _______________________________________________________________________________________________________________________________________________ what are different types of journals in teradata? There are 3 different types of journals available in Teradata. They are 1. Transient Journal - This maintains current transaction history. Once the query is successful it deletes entries from its table . If the current query transaction fails, It rolls back data from its table. 2. Permanent Journal - This is defined when a table is created. It can store BEFORE or AFTER image of tables. DUAL copies can also be stored. Permanent Journal maintains Complete history of table. 3.Down AMP recovery Journal (DARJ) - This journal activates when the AMP which was supposed to process goes down. This journal will store all entries for AMP which went down. Once that AMP is up, DARJ copies all entries to that AMP and makes that AMP is sync with current environment. _________________________________________________________________________________________________________________________________________________ What are the reasons for product joins ? 1. Stale or no stats causing optimizer to use product join 2. Improper usage of aliases in the query. 3. missing where clause ( or Cartesian product join 1=1 ) 4. non equality conditions like > ,< , between example ( date) 5. few join conditions 6. when or conditions are used. last but not the least product joins are not bad always!! sometimes PJ are better compared to other types of joins. _____________________________________________________________________________________________________________________________ How to rename columns using views? Create a view which is a subset of the employee table as follows. Renaming columns in views will help increase security of sensitive tables and hiding columns under alias names Replace view EmployeeV( number,fullname,addr,phno,depnum,sal, expr) as locking row for access sel EmpNo , Name , Address, Phone , DeptNo , Salary , YrsExp from employee; ___________________________________________________________________________________________________________________________ how do you you implement Multi valued compression in an existing table? MVC can be implemented on following conditions 1. A new column with multi valued compression can be added to an existing table, but cannot modify existing compressed column. 2. Create a new table with column which has MVC and do insert .. select from original table Or CREATE TABLE... as with column designated MVC. ___________________________________________________________________________________________________________________________ list Built-in functions used in teradata ? The main functionality of built in functions is that they dont need any arguments or paramaters and can be used directly with select to return system values. Some of them are • SESSION: – Returns a number for the session for current user . • TIME: – this function provides the current time based on a 24-hour day • USER: – This one gives the user name of the current user. • ACCOUNT: – display's your Teradata Account information • CURRENT_DATE: – Returns the current system date • CURRENT_TIME: - This function returns the current system time and current session ‘Time Zone’ displacement. • CURRENT_TIMESTAMP: - Returns the current system timestamp with TimeZone • DATABASE: – It returns the name of the default database for the current user. • DATE: – same as Current_DATE and is teradata built in . ___________________________________________________________________________________________________________________________ How do you whether table is locked or not? Just run the following query on the table. Lock Table DBNAME.TABLENAME write nowait Select * from DBNAME.TABLENAME; If this table is locked currently then , then the above statement would fail as Nowait will not wait for locks to be released on that table . ___________________________________________________________________________________________________________________________ How many error tables are there in fload and Mload and what is their significance/use? Answers: Fload uses 2 error tables ET TABLE 1: where format of data is not correct. ET TABLE 2: violations of UPI It maintains only error field name, error code and data parcel only. Mload also uses 2 error tables (ET and UV), 1 work table and 1 log table 1. ET TABLE - Data error MultiLoad uses the ET table, also called the Acquisition Phase error table, to store data errors found during the acquisition phase of a MultiLoad import task. 2. UV TABLE - UPI violations MultiLoad uses the UV table, also called the Application Phase error table, to store data errors found during the application phase of a MultiLoad import or delete task Apart from error tables, it also has work and log tables 3. WORK TABLE - WT Mload loads the selected records in the work table 4. LOG TABLE A log table maintains record of all checkpoints related to the load job, it is essential/mandatory to specify a log table in mload job. This table will be useful in case you have a job abort or restart due to any reason. ___________________________________________________________________________________________________________________________ How do you find out number of AMP's in the Given system Answer 1.running following query in queryman Select HASHAMP () +1; 2. We can find out complete configuration details of nodes and amps in configuration screen of Performance monitor ___________________________________________________________________________________________________________________________ What are the difference types of temporary tables in Teradata? Answers: a. Global temporary tables b. Volatile temporary tables c. Derived tables Global Temporary tables (GTT) – 1. When they are created, its definition goes into Data Dictionary. 2. When materialized data goes in temp space. 3. That's why, data is active up to the session ends, and definition will remain there up-to its not dropped using Drop table statement. If dropped from some other session then its should be Drop table all; 4. You can collect stats on GTT. 5. Defined with the CREATE GLOBAL TEMPORARY TABLE sql Volatile Temporary tables (VTT) - 1. Local to a session (deleted automatically when the session terminates) 2. Table Definition is stored in System cache .A permanent table definition is stored in the DBC data dictionary database (DBC.Temptables) . 3. Data is stored in spool space. 4. That’s why; data and table definition both are active only up to session ends. 5. No collect stats for VTT.If you are using volatile table, you can not put the default values on column level (while creating table) 6. Created by the CREATE VOLATILE TABLE sql statement Derived tables 1 Derived tables are local to an SQL query. 2 Not included in the DBC data dictionary database, the definition is kept in cache. 3 They are specified on a query level with an AS keyword in an sql statement ___________________________________________________________________________________________________________________________ List types of HASH functions used in Teradata? Answer: SELECT HASHAMP (HASHBUCKET (HASHROW ())) AS “AMP#”, COUNT (*) FROM GROUP BY 1 ORDER BY 2 DESC; There are HASHROW, HASHBUCKET, HASHAMP and HASHBAKAMP. The SQL hash functions are: * HASHROW (column(s)) * HASHBUCKET (hashrow) * HASHAMP (hashbucket) * HASHBAKAMP (hashbucket) Example: SELECT HASHROW ('Teradata') AS "Hash Value" , HASHBUCKET (HASHROW ('Teradata')) AS "Bucket Num" , HASHAMP (HASHBUCKET (HASHROW ('Teradata'))) AS "AMP Num" , HASHBAKAMP (HASHBUCKET (HASHROW ('Teradata'))) AS "AMP Fallback Num" ; This is really good, by looking into the result set of above written query you can easily find out the Data Distribution across all AMPs in your system and further you can easily identify un-even data distribution Detailed Explanation...... They are 4 Type of Hash Function. 1.HashRow:- Returns the rowhash value of a given sequence of expressions in BYTE(4) data type. ex:- SEL HASHROW(department_number)FROM DEPT --------------------------------------- 79EAC67E 02C65E9D 08E241A3 9252398E BBDC9616 93604E46 4FCC0A29 552A06BC C1F8791C 2.HashBucket Returns the bucket number that corresponds to a HashRow in INTEGER data type. EX:- SEL HASHBUCKET(HASHROW(department_number)) FROM DEPT --------------------------------------- 20428 49656 31210 710 37458 2274 21802 48092 3.HashAMP Returns the identification number of the virtual AMP for the primary data row’s HashBucket in INTEGER data type. EX:- SELECT Department_Number, HashAMP(HashBucket(HashRow(Department_Number))) AS PrimaryAMP FROM Department ORDER BY 2; Department_Number PrimaryAMP ----------------- ----------- 501 1 403 3 301 3 402 6 201 9 302 10 600 11 401 13 100 14 3.HashBakAMP Returns the identification number of the virtual AMP for the fallback data row’s HashBucket in INTEGER data type. EX:- SELECT Department_Number, HashBakAMP(HashBucket(HashRow(Department_Number))) AS FbAMP FROM Department ORDER BY 2; Department_Number FbAMP ----------------- ----------- 501 0 403 2 301 2 402 7 201 8 600 10 302 11 401 12 100 15 ___________________________________________________________________________________________________________________________ What are the advantages and dis-advantages of secondary Indexes? Answer: Advantages: 1. A secondary index might be created and dropped dynamically 2.A table may have up to 32 secondary indexes. 3. Secondary index can be created on any column. .Either Unique or Non-Unique 4. It is used as alternate path or Least frequently used cases. ex. defining SI on non indexed column can improve the performance, if it is used in join or filter condition of a given query. 5. Collecting Statistics on SI columns make sure Optimizer choses SI if it is better than doing Full Table Scans Disadvantages 1. Since Sub tables are to be created, there is always an overhead for additional spaces. 2. They require additional I/Os to maintain their sub tables. 3. The Optimizer may, or may not, use a NUSI, depending on its selectivity. 4. If the base table is Fallback, the secondary index sub table is Fallback as well. 5. If statistics are not collected accordingly, then the optimizer would go for Full Table Scan. ___________________________________________________________________________________________________________________________ Where does TD store transient journal? In perm space -> dbc.transientjournal But that special table can grow over dbc's perm limit until the whole system runs out of perm space. ___________________________________________________________________________________________________________________________ How to select first N Records in Teradata? Answers: To select N records in Teradata you can use RANK function. Query syntax would be as follows SELECT BOOK_NAME, BOOK_COUNT, RANK(BOOK_COUNT) A FROM LIBRARY QUALIFY A <= 10; ___________________________________________________________________________________________________________________________ How to view every column and the columns contained in indexes in Teradata? Answers: Following query describes each column in the Teradata RDBMS SELECT * FROM DBC.TVFields; Following query describes columns contained in indexes in the Teradata RDBMS SELECT * FROM DBC.Indexes; ___________________________________________________________________________________________________________________________ How Teradata makes sure that there are no duplicate rows being inserted when it’s a SET table? Answers: Teradata will redirect the new inserted row as per its PI to the target AMP (on the basis of its row hash value), and if it find same row hash value in that AMP (hash synonyms) then it start comparing the whole row, and find out if duplicate. If it’s a duplicate it silently skips it without throwing any error. ___________________________________________________________________________________________________________________________ What is a clique? A clique is a set of Teradata nodes that share a common set of disk arrays which are connected in daisy chain network to each disk array controller. Cliques provide data accessibility if a node fails for any reason, Proms are distributed across all nodes in the system. Large multiple node systems will have clique mechanisms associated with them ___________________________________________________________________________________________________________________________ What is the difference between MultiLoad & Fastload interns of Performance? Answers: If you want to load, empty table then you use the fastload, so it will very useful than the MultiLoad ,because fastload performs the loading of the data in 2phase and its no need a work table for loading the data . So it is faster as well as it follows the below steps to load the data in the table Phase1 - It moves all the records to the entire AMP first without any hashing Phase2 - After giving end loading command, Amp will hashes the record and send it to the appropriate AMPS. MultiLoad: It does the loading in the 5 phases Phase1 - It will get the import file and checks the script Phase2 - It reads the record from the base table and store in the work table Phase3 - In this acquisition phase it locks the table header Phase4 - In the DML operation will done in the tables Phase5 - In this table locks will be released and work tables will be dropped. ___________________________________________________________________________________________________________________________ How does indexing improve query performance? Answers: Indexing is a way to physically reorganize the records to enable some frequently used queries to run faster. The index can be used as a pointer to the large table. It helps to locate the required row quickly and then return it back to the user. Or The frequently used queries need not hit a large table for data. They can get what they want from the index itself. - cover queries. Index comes with the overhead of maintenance. Teradata maintains its index by itself. Each time an insert/update/delete is done on the table the indexes will also need to be updated and maintained. Indexes cannot be accessed directly by users. Only the optimizer has access to the index. ___________________________________________________________________________________________________________________________ What is error table? What is the use of error table? Answers: The Error Table contains information concerning: - Data conversion errors Constraint violations and other error conditions: * Contains rows which failed to be manipulated due to constraint violations or Translation error * Captures rows that contain duplicate Values for UPIs. * It logs errors & exceptions that occurs during the apply phase. * It logs errors that are occurs during the acquisition phase ___________________________________________________________________________________________________________________________ How to find out list of indexes in Teradata? IndexType Description P Nonpartitioned Primary Q Partitioned Primary S Secondary J join index N hash index K primary key U unique constraint V value ordered secondary H hash ordered ALL covering secondary O valued ordered ALL covering secondary I ordering column of a composite secondary index M Multi column statistics D Derived column partition statistics 1 field1 column of a join or hash index 2 field2 column of a join or hash index SELECT databasename, tablename, columnname, indextype, indexnumber, indexname FROM dbc.indices ORDER BY databasename, tablename, indexnumber; ___________________________________________________________________________________________________________________________ When should the statistics be collected? Here are some excellent guidelines on when to collect statistics: · All Non-Unique indices · Non-index join columns · The Primary Index of small tables · Primary Index of a Join Index · Secondary Indices defined on any join index · Join index columns that frequently appear on any additional join index columns that frequently appear in WHERE search conditions · Columns that frequently appear in WHERE search conditions or in the WHERE clause of joins. \___________________________________________________________________________________________________________________________ How to make sure Optimizer chooses NUSI over Full Table Scan (FTS)? Answer: A optimizer would prefer FTS over NUSI, when there are no Statistics defined on NUSI columns. It would prefer scanning for full table instead of going for Subtables in NUSI since optimized does not have information about subtables of NUSI It is always suggested to collect statistics whenever NUSI columns are defined on the table. Verify whether index is being used by checking in Explain plan. ___________________________________________________________________________________________________________________________ What are the basic rules that define how PI is defined in Teradata? Answer: The following rules govern how Primary Indexes in a Teradata Database must be defined as well as how they function: One Primary Index per table. A Primary Index value can be unique or non-unique. The Primary Index value can be NULL. The Primary Index value can be modified. The Primary Index of a populated table cannot be modified. A Primary Index has a limit of 64 columns ___________________________________________________________________________________________________________________________ What are the basic criteria to select Primary Index column for a given table? Answer: A thumb rule of ADV demographics is followed. Access Demographics Identify index candidates that maximize one-AMP operations. Columns most frequently used for access (Value and Join). Distribution Demographics Identify index candidates that optimize parallel processing. Columns that provide good distribution. Volatility Demographics Identify index candidates with low maintenance I/O. ___________________________________________________________________________________________________________________________ What is explain in teradata? The EXPLAIN facility is a teradata extension that provides you with an "ENGLISH" translation of the steps chosen by the optimizer to execute an SQL statement. It may be used on any valid teradata database with a preface called "EXPLAIN". The following is an example:- EXPLAIN select last_name first_name FROM employees; The EXPLAIN parses the SQL statement but does not execute it. This provides the designer with an "execution strategy". The execution strategy provides what an optimizer does but not why it chooses them. The EXPLAIN facility is used to analyze all joins and complex queries. ___________________________________________________________________________________________________________________________ What are the different return codes(severity errors) in Teradata utilities? There are 3 basic return codes (severity errors) in teradata utilities. 0 - success 4 - Warning 8 - User error 12 - System error 16 - system error Please note that apart from this there are separate error codes for each of the error returned from sql queries. ___________________________________________________________________________________________________________________________ How do you set default date setting in BTEQ? Answer There are two default date setting in BTEQ. They have to be set after logging on to the session set session dateform = ANSIDATE ; /*format is yyyy-mm-dd */ set session dateform = integerdate ; /* format is yy/mm/dd -teradata date format */ ___________________________________________________________________________________________________________________________ What does DROP table command do? What does DROP table table_name command do? It deletes all data in table_name Removes the definition from the data dictionary Removes all explicit access rights on the table ___________________________________________________________________________________________________________________________ Is Like comparison case-sensitive in Teradata? Answer LIKE operator is not case sensitive in Teradata session mode. Consider the following example Select F_name from employee where F_name like '%JO%’; The following query will pick values matching with 'JO' and 'jo' as well, since Teradata is not case-sensitive To overcome this problem, a new function called "CASESPECIFIC" is used in TERADATA as follows Select F_name from employee where F_name (CASESPECIFIC) like '%JO%’; ___________________________________________________________________________________________________________________________ What are advantages of compression on tables? - They take less physical space then uncompressed columns hence reducing space cost - They improve system performance as less data will be retrieved per row fetched , more data is fetched per data block thus increasing data loading speed - They reduce overall I/O ___________________________________________________________________________________________________________________________ How do you submit bteq script (batch mode)? Answer 1. Start the BTEQ , by typing BTEQ 2. Enter the following command .run file = BTEQScript.btq OR 1. Bteq < BTEQScript.btq BTEQScript.btq contains following .logon 127.0.0.1/dbc, dbc; sel top 10 * from dbc.tables; .quit ___________________________________________________________________________________________________________________________ What are the benefits of Permanent Journal? The benefits of Permanent Journal are Permits capture of before images for database rollback. Permits capture of after images for database roll forward. Permits archiving change images during table maintenance. Reduces need for full-table backups. Provides a means of recovering NO FALLBACK tables. Requires additional disk space for change images. Requires user intervention for archive and recovery activity ___________________________________________________________________________________________________________________________ What is the command in BTEQ to check for session settings ? Answer The BTEQ .SHOW CONTROL command displays BTEQ settings. ___________________________________________________________________________________________________________________________ What are the benefits of fallback? The benefits of fallback are Protects your data from hardware (disk) failure. Protects your data from software (node) failure. Automatically recovers with minimum recovery time, after repairs or fixes are complete ___________________________________________________________________________________________________________________________ What’s the difference between TIMESTAMP (0) and TIMESTAMP (6)? TIMESTAMP (0) is CHAR (19) and TIMESTAMP (6) is CHAR (26) Timestamp(0) is YYYY-MM-DDbHH:MI:SS Timestamp(6) is YYYY-MM-DDbHH:MI:SS.ssssss ( milliseconds extra) ___________________________________________________________________________________________________________________________ What is a Dirty-Read or Stale-Read Lock? This occurs when a access lock is applied on the table which is doing a update. May produce erroneous results if performed during table maintenance resulting in Dirty Read or stale read , which might result in inconsistent result set. ___________________________________________________________________________________________________________________________ Differences between BTEQ and Sql assistant (query man)? BTEQ Basic Teradata Query utility SQL front-end Report writing and formatting features Interactive and batch queries Import/Export across all platforms The default number of sessions, upon login, is 1. Teradata Query Manager / Queryman / TeradataSQL Assistant SQL front-end for ODBC compliant databases Historical record of queries including: - Timings - Status - Row counts Random sampling feature Limit amount of data returned Import/Export between database and PC Export to EXCEL or ACCESS ___________________________________________________________________________________________________________________________ How do you execute the given SQL statement repeatedly in BTEQ? Answer: Select top 1* from database.table1; =n Here "=n” is to run the previous sql statement, "n" number of times. ___________________________________________________________________________________________________________________________ What are types of PARTITION PRIMARY INDEX (PPI) in Teradata? Answer: 1. Partition by CASE CREATE TABLE ORDER_Table ( ORD_number integer NOT NULL, customer_number integer NOT NULL, order_date date , order_total integer ) PRIMARY INDEX (customer_number) PARTITION BY case1 ( order_total < 10000 , order_total < 20000 , order_total < 30000, NO CASE OR UNKNOWN ) ; 2. Partition by Range - example using date range CREATE TABLE ORDER_Table ( ORD_number integer NOT NULL, customer_number integer NOT NULL, order_date date , order_total integer ) PRIMARY INDEX (customer_number) PARTITION BY range1 ( Order_date BETWEEN date '2010-01-01' AND date '2010-12-01' EACH interval '1' month , NO RANGE OR UNKNOWN); P.S:If we use NO RANGE or NO CASE - then all values not in this range will be in a single partition. If we specify UNKNOWN, then all null values will be placed in this partition ___________________________________________________________________________________________________________________________ Can we define PARTITION PRIMARY INDEX (PPI) on a Primary Index column in Table? Explain Why? Answer: PPI cannot be defined on PI column in Table. Since PI is used to locate an AMP and store the data based on HASH value (ROW ID ) it cannot be used as PPI column. In case of PPI , The data stored in AMP's are Partitioned based on PPI column after they are row hashed (ROW KEY = ROW ID +PPI value ) P.S: If you want to create UPI on table, then PPI column can be added as part of PI . ___________________________________________________________________________________________________________________________ How to skip 1st record while using Bteq IMPORT? Answer: By using SKIP=1 ; , we can skip first record. .import infile=<filename>, skip=1; ___________________________________________________________________________________________________________________________ What is TENACITY? What is its default value? Answer TENACITY specifies the amount of time in hours, to retry to obtain a loader slot or to establish all requested sessions to logon. The default for Fast Load is “no tenacity”, meaning that it will not retry at all. If several FastLoad jobs are executed at the same time, we recommend setting the TENACITY to 4, meaning that the system will continue trying to logon for the number of sessions requested for up to four hours. ___________________________________________________________________________________________________________________________ What does SLEEP function does in Fast load? Answer The SLEEP command specifies the amount minutes to wait before retrying to logon and establish all sessions. Sleep command can be used with all load utilities not only fastload. This situation can occur if all of the loader slots are used or if the number of requested sessions is not available. The default value is 6 minutes. If tenacity was set to 2 hours and sleep 10 minutes, Sleep command will try to logon for every 10 minutes up to 2 hours duration. ___________________________________________________________________________________________________________________________ What is Cross Join? Answer: It is a Teradata specified Join, which is used as equivalent to product join. There is no “On” clause in case of CROSS join SELECT EMP.ename , DPT.Dname FROM employee EMP CROSS JOIN Department DPT WHERE EMp.deptno = DPT.depto ; ___________________________________________________________________________________________________________________________ How many types of Index are present in teradata? Answer: There are 5 different indices present in Teradata 1. Primary Index a.Unique primary index b. non Unique primary index 2. Secondary Index a. Unique Secondary index b. non Unique Secondary index 3. Partitioned Primary Index a. Case partition (ex. age, salary...) b. range partition ( ex. date) 4. Join index a. Single table join index b. multiple table join index c. Sparse Join index ( constraint applied on join index in where clause) 5. Hash index ___________________________________________________________________________________________________________________________ Difference between Stored Procedure and Macro? Answer: SP: It does not return rows to the user. It has to use cursors to fetch multiple rows It used Inout/Out to send values to user It Contains comprehensive SPL It is stored in DATABASE or USER PERM A stored procedure also provides output/Input capabilities Macros: It returns set of rows to the user. It is stored in DBC SPOOL space A macro that allows only input values Hi Vinay. It is Possible to return the ROWS to user From Procedure. Syntax is:-- CREATE PROCEDURE PROCEDURE_EMP() DYNAMIC RESULT SETS 1 BEGIN DECLARE cur CURSOR WITH RETURN only FOR SELECT * FROM EMP; OPEN cur; END; ___________________________________________________________________________________________________________________________ What is a Sparse Index? Answer: Sparse Join Indexes are a type of Join Index which contains a WHERE clause that reduces the number of rows which would otherwise be included in the index. All types of join indexes, including single table, multitable, simple or aggregate can be sparse. How to handle nulls in Teradata??? How many columns can be there in a table??? How to handle nulls in Teradata??? How many columns can be there in a table??? Answer 1. Use zeroifnull, nullifzero in select and NULL in insert directly. 2. 256 columns max per table. ___________________________________________________________________________________________________________________________ How to find average sal with out using avg function???? Answer Without using "avg" we can find the avg salary by using sum (sal)/count (sal); sel sum(sal)/count(sal) as avgsal from tablename ___________________________________________________________________________________________________________________________ What is difference B/w User and database in Teradata? Answer: - User is a database with password but database cannot have password - Both can contain Tables , views and macros - Both users and databases may or may not hold privileges - Only users can login, establish a session with Teradata database and they can submit requests ___________________________________________________________________________________________________________________________ How do you create materialized view in Teradata? Answer: There is no such thing as a "materialized view" in Teradata. The equivalent in Teradata would be a join index (or aggregate index) with a corresponding view put on top of it. The command to create one is "CREATE JOIN INDEX...(lots of options)". Join indices are maintained when the data is maintained. They cannot be accessed directly, but are invoked when the optimizer determines it is beneficial. Even though you cannot invoke a join index directly, you create a view that looks like the join index and in that way, it would be similar to having direct access. However, this approach does not guarantee that the join index will be used. ___________________________________________________________________________________________________________________________ What are Differences between Teradata and ANSI Session modes in Teradata? Answer: TERADATA MODE ------------- 1. Comparison is not Case sensitive 2. Create table are default to SET tables 3. Each transaction is committed implicitly 4. Supports all Teradata commands 5. It follows BTET (Begin and End Transaction) Mode ANSI MODE ------------- 1. Comparison is CASE sensitive 2. Create table are default to MULTISET tables 3. Each transaction has to be committed explicitly 4. Does not support all Teradata commands 5. It does not follow BTET Mode ___________________________________________________________________________________________________________________________ What are the scenarios in which Full Table Scans occurs? 1. The where clause in SELECT statement does not use either primary index or secondary index 2. SQL Statement which uses a partial value (like or not like), in the WHERE statement. 3. SQL Statement which does not contain where clause. 4. SQL statement using range in where clause. Ex. (col1 > 40 or col1 < =10000) ___________________________________________________________________________________________________________________________ How to identify PPI columns? Answer: Select databasename , tablename , columnposition ,columnname from dbc.indices where indextype ='Q' order by 1 ,2,3 ; ___________________________________________________________________________________________________________________________ How to skip the header row in the fastload script RECORD 2; /* this skips first record in the source file */ DEFINE ... ___________________________________________________________________________________________________________________________ Explain types of re-distribution of data happening for joining of columns from two tables Answer: Case 1 - P.I = P.I joins Case 2 - P.I = N.U.P.I joins Case 3 - N.U.P.I = N.U.P.I joins Case1 - there is no redistribution of data over amp's. Since amp local joins happen as data are present in same AMP and need not be re-distributed. These types of joins on unique primary index are very fast. Case2 - data from second table will be re-distributed on all amps since joins are happening on PI vs. NUPI column. Ideal scenario is when small table is redistributed to be joined with large table records on same amp case3 - data from both the tables are redistributed on all AMPs. This is one of the longest processing queries , Care should be taken to see that stats are collected on these columns Can you load multiple data files for same target table using Fastload? Can you load multiple data files for same target table using Fastload? Answer: Yes, we can Load a table using multiple datafiles in Fastload. Before giving "end loading" statement user can define file path and use insert sql for multiple source files and give "end loading" statement at the end ___________________________________________________________________________________________________________________________ Why does varchar occupy 2 extra bytes? Answer The two bytes are for the number of bytes for the binary length of the field. It stores the exact no of characters stored in varchar ___________________________________________________________________________________________________________________________ How many types of Skew exist? Answer If you utilized unequally TD resources (CPU,AMP,IO,Disk and etc) this is called skew exists. Major are 3 types of skews (CPU skew, AMP/Data skew, IO Skew). -Data skew? When data is not distributed equally on all the AMPs. -Cpu skew? Who is taking/consuming more CPU called cpu skew. -IO skew? Who perform more IO Operation? Resulting in IO Skew ___________________________________________________________________________________________________________________________ Why Fload doesn’t support multiset table? Answer Fload does not support Multiset table because of following reason. Say, the fastload job fails. Till the fastload failed, some number of rows was sent to the AMP's. Now if you restart FLOAD, it would start loading record from the last checkpoint and some of the consecutive rows are sent for the second time. These will be caught as duplicate rows are found after sorting of data. This restart logic is the reason that Fastload will not load duplicate rows into a MULTISET table. It assumes they are duplicates because of this logic. Fastload support Multiset table but does not support the duplicate rows. Multiset tables are tables that allow duplicate rows. When Fastload finds the duplicate rows it discards it. Fast Load can load data into multiset table but will not load the duplicate rows ___________________________________________________________________________________________________________________________ What is Global Temporary table? What is the use of this Global Temporary table? Can we take collect stats on Derived Tables and Volatile tables and Temporary tables? What is Global Temporary table? What is the use of this Global Temporary table? Answer 1. No for volatile and derived tables and yes for global tables. 2. Global tables are temp tables like volatile tables but unlike volatile tables, their definitions are retained in dd. 3. It is used whenever there is a need for a temporary table with same table definition for all users. ___________________________________________________________________________________________________________________________ What is the default join strategy in Teradata??? Answer # 1 The Teradata Cost based optimizer will decide the join strategy based on the optimum path. The common strategies followed are from Merge, Hash & Nested Join Answer # 2 Three strategies followed by optimizer are: 1. Duplication of rows of one table to every amp --> This one is opted by optimizer when the non-PI column is on a small table. 2. Redistribution of the non PI rows of the table to the amp containing the matching PI row amp. --> This one is opted when the tables are large. The non PI column is in one table is redistributed to the amp containing the matching PI. 3. Redistribute both the rows of the table by hash values. --> This is chosen when the join is on a column that is not the PI in either table. This will require the most spool space. ___________________________________________________________________________________________________________________________ What do High confidence, Low confidence and No confidence mean in EXPLAIN plan? Answer Explain gives the execution strategy means what are the different steps that the query will go through. HIGH CONFIDENCE: Statistics are collected. LOW CONFIDENCE: Statistics are not collected. But the where condition is having the condition on indexed column. Then estimations can be based on sampling. NO CONFIDENCE: Statistics are not collected and the condition is on non indexed column. ___________________________________________________________________________________________________________________________ How to Skip or Get first and Last Record from Flat File through MultiLoad? Answer In .IMPORT command in Mload we have a option to give record no. from which processing should begin. i.e. ‘FROM m’ ‘m’ is a logical record number, as an integer, of the record in the identified data source where processing is to begin. You can mention ’m’ as 2 and processing will start from second record. THRU k and FOR n are two options in the same Mload command, functions same towards the end of the processing. Adding to the above, if from n"start record" and for n "stop record" are not mentioned, mload considers records from start till the end of the file. ___________________________________________________________________________________________________________________________ Which is faster – MultiLoad delete or Delete command? Answer MultiLoad delete is faster then normal Delete command, since the deletion happens in data blocks of 64Kbytes, where as delete command deletes data row by row. Transient journal maintains entries only for Delete command since Teradata utilities doesn’t support Transient journal loading. ___________________________________________________________________________________________________________________________ What is Teradata Virtual storage? Answer This concept is introduced in TD12. It does the following tasks - maintains information on frequency of data access - tracks data storage task on physical media - migrating frequently used data to fast disks and less frequently used data to slower disks - allocating cyclinders from storage to individual AMPs ___________________________________________________________________________________________________________________________ how to start / stop a database in windows? Answer 1. logon to CMD 2. check for state pdestate -d 3. run the following command to start "net start recond" 4. check for status pdestate -d 5. to STOP the database - Trusted Parallel Application or TPA tpareset -x comment The -x option stops Teradata without stopping the OS. ___________________________________________________________________________________________________________________________ What is a role? Answer A role is a set of access rights which can be assigned to the users. They indirectly help in performance by reducing the number of rows entered in DBC.accessrights ___________________________________________________________________________________________________________________________ what is a profile? Answer A profile contains set of user parameters like accounts, default database, spool space, and temporary space for a group of users To assign the profile to a user, use the AS PROFILE modified in a CREATE USER or MODIFY USER statement: MODIFY USER username AS PROFILE=profilename ; To remove a profile from a member but retain the profile itself: MODIFY USER username AS PROFILE=NULL ; ___________________________________________________________________________________________________________________________ Why are AMPs and PEs called as vprocs ? Answer AMPs and PEs are implemented as “virtual processors - vprocs”. They run under the control of PDE and their number is software configurable. AMPs are associated with “virtual disks – vdisks” which are associated with logical units (LUNs) within a disk array Answer2: Vprocs:Virtual process From PE to AMP (This is the network root via MSP(message passing layer),The processing data will store in Disks(These are Physical disks),Each Amp have too many P.disks,to migrate these P.disks The Bynet Network maintains Virtual disks.These V.disks will responsible for data migration.hence they are called as Virtual Process(VPROCS). ___________________________________________________________________________________________________________________________ What is residual condition in explain plan ? Answer It is a condition which help u to reduce the number used for join condition. Residual condition does not help in locating a row ___________________________________________________________________________________________________________________________ How to check if given object is a database or user ? Answer To check whether the given object is user or database , we can use following query sel * from dbc.databases where dbkind ='U' or dbkind='D'; ___________________________________________________________________________________________________________________________ What are set tables and multiset tables in Teradata?Explain with an appropriate example? Answers: 1) Set tables cannot accept duplicate at row level not Index or key level. Example of rows for set table: R1 c2 c3 c4 ..cn 1 2 3 4 ... 9 Accepted 1 2 3 4 ... 9 Duplicate is Rejected 2 1 2 4 ... 9 3 2 4 4 ... 9 4 3 4 4 ... 9 2) Multi set Tables can accept duplicate at row level not Index or key level.Exmaple of rows for multi set table: R1 c2 c3 c4 .. cn 1 2 3 4 ... 9 1 2 3 4 ... 9 Duplicate is Accepted 2 1 2 4 ...9 3 2 4 4 ...9 3 2 4 4 ...9 Duplicate is Accepted ___________________________________________________________________________________________________________________________ In a table can we use primary key in one column and in another column both unique and not null constrains.if yes how? Answers: Yes, you can have a column for Primary key and have another column which will have no duplicates or null.e.g.A Salary Table will have employee ID as primary key. The table also contains TAX-ID which can not be null or duplicate ___________________________________________________________________________________________________________________________ What is use of compress in teradata?Explain? Answers: Compression is used to Minimize the table size, for example when the table size is increasing anonymously We can use Compression to reduce the size of the table Conditions:1.Compression can be declared at the time of table creation2.We can compress up to 256 column values(not columns) 3.We can't compress variable length fields (vartext,varchar..) Explanation for 2 condition:create table tab1(::Order_type char(25) compress ('air','sea','road'):)in the above example order type have 3 fields, one should be selected by the user, so one of the field will repeat for every order, like these column values we can use compress statement because these are repeating for entire table, like these column values TD supports 256 col generally NCR people will provides ready made scripts for these type of compressions However, we can store only one value per column and not 3(air, sea, road). The compressed value is stored in column header and is to be used as default of that column unless a value is present. e.g. Dept of Taxes in Washington has a database that tracks all people working in Washington. Around 99.9% of the tax payers would have Washington as their state code on their address. Instead of storing “Washington” in millions of records the compress will store the value “Washington” in the table header. Now, a value can be assumed in a row as a default unless another value exists inside the column ___________________________________________________________________________________________________________________________ What is the process to restart the multiload if it fails? Answers: MULTILOAD will creates 2 error tables, 1 work table When MULTILOAD fails We have to unlock the Main Table, here error tables and work tables are not locked like FLOAD. To Unlock the Main Table in case of acquisation Phase : Mload RELEASE MLOAD ; To release lock in application phase failure : RELEASE MLOAD .IN APPLY; ___________________________________________________________________________________________________________________________ Can we load a Multi set table using MLOAD? Answers: We can Load SET, MULTISET tables using Mload, But here when loading into MULTISET table using MLOAD duplicate rows will not be rejected, we have to take care of them before loading.But in case of Fload when we are loading into MULTISET duplicate rows are automatically rejected, FLOAD will not load duplicate rows weather table is SET or MULTISET ___________________________________________________________________________________________________________________________ Can I use “drop” statement in the utility “fload”? Answers: YES,But you have to declare it out of the FLOAD Block it means it should not come between .begin loading,.end loading FLOAD also supports DELETE,CREATE,DROP statements which we have to declare out of FLOAD blocking the FLOAD Block we can give only INSERT ___________________________________________________________________________________________________________________________ Is it possible that there r two primary key will be in a same table? Answers: Primary key ========== 1. A table should have only one primary key 2. More than one column can consist of a primary key – upto 64 columns 3. Can not be NULL values (missing values) 4. Should be unique values (no duplicate value) Foreign key ========== 1. A table may have zero or more than that up-to 32 keys 2. More than one column can consist a primary key – up to 64 columns 3. Can have NULL values 4. Can have duplicate values 5. There should be a corresponding primary key in the parent table to enforce referential integrity for a foreign key ___________________________________________________________________________________________________________________________ Teradata performance tuning and optimization Answers: 1. collecting statistics 2. Explain Statements 3. Avoid Product Joins when possible 4. select appropriate primary index to avoid skewness in storage 5. Avoid Redistribution when possible 6. Use sub-selects instead of big "IN" lists 7. Use derived tables 8. Use GROUP BY instead of DISTINCT ( GROUP BY sorts the data locally on the VPROC. DISTINCT sorts the data after it is redistributed) 9. Use Compression on large tables ___________________________________________________________________________________________________________________________ What is the difference between Global temporary tables and Volatile temporary tables? Answers: Global Temporary tables (GTT) – 1. When they are created, its definition goes into Data Dictionary. 2. When materialized data goes in temp space. 3. That's why, data is active upto the session ends, and definition will remain there up-to its not dropped using Drop table statement.If dropped from some other session then its should be Drop table all; 4. you can collect stats on GTT. Volatile Temporary tables (VTT) - 1. Table Definition is stored in System cache 2. Data is stored in spool space. 3. thats why, data and table definition both are active only upto session ends. 4. No collect stats for VTT.If you are using volatile table, you can not put the default values on column level ( while creating table ) ___________________________________________________________________________________________________________________________ Join strategies? Join Strategies There are 2 tables, table A with 10 million records, table B has 100 million records, now we are joining both tables, when we seen Explain Plan the plan showing TD will took the table A and it will redistributes itNow the Question is: By that plan is the optimizer is correct job or not ? Justify Ans2. From the same above example now the optimizer is taking Table B (100 million records) and it is distributing it, Now is the optimizer is doing best? and How you avoid this situation Answers: Teradata is smart enough to decide when to redistribute and when to copy. It compares the tables. Are they comparable? or one is big as compared to the other?Based on simple logic it decides whether to distribute the smaller table on all the AMPs or to copy. what I mean is the small table is copied into all the AMPs in the SPOOL space.Remember all always the Join's has to take place on the AMPs SPOOL Space.By redistributing it is making sure that the 100 million rows table gets the feeling that it is making AMP local JOIN. Remember the basic thing what ever Teradata does. It does keeping in consideration for Space and Performance and not to forget the Efficiency. My simple formula: If the table is small redistribute them to all the AMPs to have the AMP local Join. Always JOINs are made AMP local if it cannot then you have the high chance of running out of SPOOL space. ___________________________________________________________________________________________________________________________ How can you track Login Parameters of users in Teradata? Answers: You can view all these parameters in this Data Dictionary Table DBC.LOGONOFF SELECT LOGDATE, LOGTIME, USERNAME, EVENT FROM DBC.LOGONOFF; ___________________________________________________________________________________________________________________________ How can you find the Teradata Release and Version information from Data Dictionary Table? Answers: To find Release and Version information you can query this Data Dictionary table DBC.DBCINFO SELECT * FROM DBC.DBCINFO; ___________________________________________________________________________________________________________________________ How can you find the Table Space Size of your table across all AMPs? Answers: You can find the Table Space Size of your table from this Data Dictionary Table DBC.TABLESIZE SELECT DATABASENAME, TABLENAME, SUM(CURRENTPERM) FROM DBC.TABLESIZE WHERE DATABASENAME = ‘’ AND TABLENAME = ‘’ GROUP BY DATABASENAME , TABLENAME; ___________________________________________________________________________________________________________________________ How can you determine I/O and CPU usage at a user level in Teradata? Answers: You can find out I/O and CPU Usage from this Data Dictionary Table DBC.AMPUSAGE; SELECT ACCOUNTNAME, USERNAME, SUM(CPUTIME) AS CPU, SUM(DISKIO) AS DISKIO FROM DBC.AMPUSAGE GROUP BY 1,2 ORDER BY 3 DESC; ___________________________________________________________________________________________________________________________ Explain TPUMP (Teradata Parallel Data Pump) Utility in Teradata? Answers: * TPUMP allows near real time updates from Transactional Systems into the Data Warehouse. * It can perform Insert, Update and Delete operations or a combination from the same source. * It can be used as an alternative to MLOAD for low volume batch maintenance of large databases. * TPUMP allows target tables to have Secondary Indexes, Join Indexes, Hash Indexes, Referential Integrity, Populated or Empty Table, Multiset or Set Table or Triggers defined on the Tables. * TPUMP can have many sessions as it doesn’t have session limit. * TPUMP uses row hash locks thus allowing concurrent updates on the same table. ___________________________________________________________________________________________________________________________ Explain Ferret Utility in Teradata? Ferret (File Reconfiguration tool) is an utility which is used to display and set Disk Space Utilization parameters within Teradata RDBMS. When you select the Ferret Utility parameters, it dynamically reconfigures the data on disks. We can run this utility through Teradata Manager; to start the Ferret Utility type (START FERRET) in the database window. Answers: Following commands can be used within Ferret Utility: 1. SHOWSPACE – Well this command reports you the amount of Disk Cylinder Space is in use and the amount of Disk Cylinder Space is available in the system. This will give you an information about Permanent Space cylinders, Spool Space cylinders, Temporary Space cylinders, Journaling cylinders, Bad cylinders and Free cylinders. For each of these 5 things it will present you 3 parameters i.e. Average Utilization per cylinder, % of total avaliable cylinders and number of cylinders. 2. SHOWBLOCKS – This command will help you in identifying the Data Block size and the number of Rows per data block. This command displays the Disk Space information for a defined range of Data Blocks and Cylinders. ___________________________________________________________________________________________________________________________ Can you recover the password of a user in Teradata? Answers: No, you can’t recover the password of a user in Teradata. Passwords are stored in this data dictionary table (DBC.DBASE) using a one-way encryption method. You can view the encrypted passwords using the following query SELECT * FROM DBC.DBASE; ___________________________________________________________________________________________________________________________ What is Reconfiguration Utility in Teradata and What it is used for? Answers: * When we feed Primary Index value to Hashing Algorithm then it gives us Row Hash(32 bit number) value which is used to make entries into Hash Maps. * Hash Maps are the mechansim for determining which AMP will be getting that row. * Each Hash Map is an array of 65,536 entries and its size is close to 128KB. When Teradata is installed on a system then there are some scrpits which we need to execute i.e. DIP Scripts. So it creates a Hash Maps of 65,536 entries for the current configuration. But what if you want to add some more AMPs into your system? Reconfiguration (Reconfig) is a technique for changing the configuration (i.e. changing the number of AMPs in a system) and is controlled by the Reconfiguration Hash Maps. System builds Reconfiguration Hash Maps by reassigning hash map entries to reflect new configuration of system. Lets understand this concept with the help of an example; suppose you have a 4 AMPs system which holds 65,536 entries. Each AMP is responsible for holding (65,536/4=16,384) 16,384 entries. Now you have added 2 more AMPs in your current configuration so you need to reconfigure your system. Now each AMP would be responsible for holding (65,536/6=10922) 10,922 entries. ___________________________________________________________________________________________________________________________ What is the difference between Sub-Query & Co-Related Sub-Query? Answers: When queries are written in a nested manner then it is termed as a sub-query. A Sub-Query get executed once for the parent statement whereas Co-Related Sub-Query get executed once for each row of the parent query. Select Empname, Deptno, Salary from Employee Emp where Salary = (Select Max(Salary) from Employee where Deptno = Emp.Deptno) order by Deptno ___________________________________________________________________________________________________________________________ What is the difference between Access Logging and Query Logging in Teradata? Answers: 1. Access Logging is concerned with security (i.e. who’s is doing what). In access logging you ask the database to log who’s doing what on a given object. The information stored is based on the object not the SQL fired or the user who fired it. 2. Query Logging (DBQL) is used for debugging (i.e. what’s happening around ?). Incase of DBQL database keep on tracking various parameters i.e. the SQLs, Resource, Spool Usage, Steps and other things which help you understand what’s going on, the information is fruitful to debug a problem. Further DBQL is enabled on a User id rather than an object like say Table or so. ___________________________________________________________________________________________________________________________ What is FILLER command in Teradata? Answers: While running Fastload or Multiload if you don’t want to load a particular field from the datafile to the target table then use the FILLER command to achieve this. Syntax for FILLER command would be as following: .LAYOUT FILE_PRODUCT; /* It is input file layout name */ .FIELD Prod_No * char(11); /* To load data into Prod_No */ .FIELD Prod_Name * char(11); /* To load data into Prod_Name */ .FIELD Location * char(11); /* To load data into Location */ .FILLER Prod_Chars * char(20); /* To skip the value for the next 5 locations */ ___________________________________________________________________________________________________________________________ What are TPUMP Utility Limitations? Answers: Following are the limitations of Teradata TPUMP Utility: * Use of SELECT statement is not allowed. * Concatenation of Data Files is not supported. * Exponential & Aggregate Operators are not allowed. * Arithmatic functions are not supported. ___________________________________________________________________________________________________________________________ What are the MultiLoad Utility limitations? Answers: MultiLoad is a very powerful utility; it has following limitations: * MultiLoad Utility doesn’t support SELECT statement. * Concatenation of multiple input data files is not allowed. * MultiLoad doesn’t support Arithmatic Functions i.e. ABS, LOG etc. in Mload Script. * MultiLoad doesn’t support Exponentiation and Aggregator Operators i.e. AVG, SUM etc. in Mload Script. * MultiLoad doesn’t support USIs (Unique Secondary Indexes), Refrential Integrity, Join Indexes, Hash Indexes and Triggers. * Import task require use of PI (Primary Index). ___________________________________________________________________________________________________________________________ What are the functions of a Teradata DBA? Answers: Following are the different functions which a DBA can perform: 1. User Management – Creation and managing Users, Databases, Roles, Profiles and Accounts. 2. Space Allocation – Assigning Permanent Space, Spool Space and Temporary Space. 3. Access of Database Objects – Granting and Revoking Access Rights on different database objects. 4. Security Control – Handling logon and logoff rules for Users. 5. System Maintenance – Specification of system defaults, restart etc. 6. System Performance – Use of Performance Monitor(PMON), Priority Scheduler and Job Scheduling. 7. Resource Monitoring – Database Query Log(DBQL) and Access Logging. 8. Data Archives, Restores and Recovery – ARC Utility and Permanent Journals. ___________________________________________________________________________________________________________________________ What are the 5 phases in a MultiLoad Utility? Answers: * Preliminary Phase – Basic Setup * DML Phase – Get DML steps down on AMPs * Acquisition Phase – Send the input data to the AMPs and sort it * Application Phase – Apply the input data to the appropriate Target Tables * End Phase – Basic Cleanup ___________________________________________________________________________________________________________________________ How to eliminate Product Joins in a Teradata SQL query? Answers: 1. Ensure statistics are collected on join columns and this is especially important if the columns you are joining on are not unique. 2. Make sure you are referencing the correct alias. 3. Also, if you have an alias, you must always reference it instead of a fully qualified tablename. 4. Sometimes product joins happen for a good reason. Joining a small table (100 rows) to a large table (1 million rows) a product join does make sense. ___________________________________________________________________________________________________________________________ How does Hashing happens in Teradata? Answers: * Hashing is the mechanism through which data is distributed and retrieved to/from AMPs. * Primary Index (PI) value of a row is the input to the Hashing Algorithm. * Row Hash (32-bit number) value is the output from this Algorithm. * Table Id + Row Hash is used to locate Cylinder and Data block. * Same Primary Index value and data type will always produce same hash value. * Rows with the same hash value will go to the same AMP. So data distribution depends directly on the Row Hash uniqueness; be careful while Choosing Indexes in Teradata.