Most commonly asked Data Warehouse & Informatica ETL Interview Questions with Answer: Set 3

Most commonly asked Data Warehouse & Informatica ETL Interview Questions with Answer: Set 3

41.Explain the 3-layer architecture of an ETL cycle.
Staging Layer − The staging layer is used to store the data extracted from different source data systems.
Data Integration Layer − The integration layer transforms the data from the staging layer and moves the data to a database, where the data is arranged into hierarchical groups, often called dimensions, and into facts and aggregate facts. 
Access Layer − The access layer is used by end-users to retrieve the data for analytical reporting.


42.What is the difference between and ETL and BI tools?
ETL means extraction transformation and load. It is a process of extracting the data, transforming it and then loading it into datasource like mysql.
Business intelligence is the process of transforming data into visualizations. ETL can also be a part of the BI process. Business intelligence means producing results out of your data.

43.Why do we need a staging area in an ETL process?
We need staging area to hold the data , and perform data cleansing and merging , before loading the data into warehouse.

44.What is a Dimension table and how is it different from a Fact table?
Fact table is a measurement of a subject.
Dimension table is from what point of view do you want to measure it (usually when we summarize / aggregate it).

45.What is an Aggregate function? Name a few common aggregate functions.
In database management an aggregate function is a function where the values of multiple rows are grouped together to form a single value of more significant meaning or measurement such as a set, a bag or a list.
Average()Count(), Maximum(), nanmean() (mean ignoring “NULL" values), Median(), Minimum(), Mode(), Sum().

46.What is the difference between Minus and Intersect in SQL? 
MINUS operator when used between two tables, gives us all the rows from the first table except the rows which are present in the second table.
INTERSECT operator returns us only the matching or common rows between two resulting sets.

47.What do you understand by Active and Passive Transformations?
An active transformation can change the number of rows as output after a transformation, while a passive transformation does not change the number of rows and passes through the same number of rows that was given to it as input.

48.What is the difference between round-robin partitioning and Hash partitioning?
Round-Robin Partitioning:
- Data is distributed evenly by Informatica among all partitions.
- This partitioning is used where the number of rows to process in each partition are approximately same.
Hash Portioning:
- Informatica server applies a hash function for the purpose of partitioning keys to group data among partitions. 
- It is used where ensuring the processes groups of rows with the same partitioning key in the same partition, need to be ensured.

49.What is Partitioning? Explain different types of partitioning.
Partitioning is to sub divide the transactions to improve performance and increasing the number of partitions enables Informatica Server for creation of multiple connections to various sources.
Types of Partitioning are Round-Robin Partitioning and Hash Portioning.

50.Explain the terms − mapplet, session, mapping, workflow − in an ETL process?
A mapplet creates or configures a set of transformations.
A session is a set of instructions to move data from sources to targets.
A mapping represents dataflow from sources to targets.
A workflow is a set of instruction sthat tell the Informatica server how to execute the tasks.

51.What is lookup transformation and when is it used?
Lookup Transformations can access data from relational tables that are not sources in mapping.
To Get a related value. To update slowly changing dimension tables.

52.What is a surrogate key in a database?
A surrogate key is any column or set of columns that can be declared as the primary key instead of a "real" or natural key.

53.What is the difference between surrogate key and primary key?
A surrogate key is any column or set of columns that can be declared as the primary key instead of a "real" or natural key.
A primary key is a special constraint on a column or set of columns. A primary key constraint ensures that the column(s) so designated have no NULL values, and that every value is unique.

54.Name the three approaches that can be followed for system integration.
The DIY Approach, The Middleware Approach and The Data Platform Approach.

55.What do you understand by a cosmetic bug or user interface bug in ETL testing?
Cosmetic defects are the bug which does not affect further processing of the application and testing. The priority of the cosmetic defects are categorized as "low", while assigning priority.
In general cosmetic errors are the spelling mistakes, tab sequence etc.,

56.What do you call the testing bug that comes while performing threshold validation testing?
It is called Boundary Value Analysis related bug.

57.What do you understand by fact-less fact table?
A fact-less fact table is a fact table that does not have any measures. It is essentially an intersection of dimensions. There are two types of fact-less tables: One is for capturing an event, and the other is for describing conditions.

58.User A is already logged into the application and User B is trying to login, but the system is not allowing. Which type of bug is it?
a) Race Condition bug
b) Calculation bug
c) Hardware bug
d) Load Condition bug

59.Which testing type is used to check the data type and length of attributes in ETL transformation?
a)Production Validation Testing
b)Data Accuracy Testing
c)Metadata Testing
d)Data Transformation testing

60.Which bug type in ETL testing doesn’t allow you to enter valid values?
a)Load Condition bugs
b)Calculation bugs
c)Race condition bug
d)Input/ Output bug

61.How you can extract SAP data using Informatica?
We can extract SAP data by using informatica with the powerConnect option.

62. Explain what are Schema Objects?
Schema objects are logical data storage structures. Schema objects do not have a one-to-one correspondence to physical files on disk that store their information.

63. Explain what is data source view?
A data source view: Contains the metadata that represents selected objects from one or more underlying data sources, or the metadata that will be used to generate an underlying relational data store if you are following the top-down approach to schema generation.

64. What are differences between connected and unconnected look up in ETL?
Connected Lookup : a connected lookup is a part of the data flow and is connected to other transformations. 
Unconnected Look Up Transformation : Unconnected lookup is generally used when the result of a query is a single column.


Image result for informatica transformations questions

5 comments:

  1. Thanks for sharing much needed question answers. Well explained.

    ReplyDelete
  2. This is a really good post.

    ReplyDelete
  3. Great list of interview questions related to Data Warehouse & Informatica ETL, thanks for sharing. Get in touch for data center design services.

    ReplyDelete
  4. Thanks for sharing valuable information and very nice article. Keep posting.

    etl testing training
    etl testing online

    ReplyDelete