1. What is Python?
Python is a high-level, interpreted programming language known for its simplicity and readability. It was created by Guido van Rossum and first released in 1991. Python supports multiple programming paradigms, including procedural, object-oriented, and functional programming. It is widely used in web development, data analysis, artificial intelligence, scientific computing, and automation tasks due to its extensive standard library and large community support.
2. What are the key features of Python?
Python has several key features that make it popular among developers:
- Simple and Readable Syntax: Python's code is easy to read and write, resembling pseudo-code.
- Interpreted Language: Code is compiled to bytecode and run by the interpreter with no separate build step, which shortens the edit-run-debug cycle.
- Dynamically Typed: No need to declare variable types explicitly.
- Extensive Standard Library: Provides modules and packages for various tasks, reducing the need for external libraries.
- Cross-Platform: Runs on multiple operating systems like Windows, macOS, and Linux.
- Object-Oriented: Supports classes, inheritance, and other OOP concepts.
- Community Support: Large ecosystem with frameworks like Django, Flask, and libraries like NumPy, Pandas.
- Extensible and Embeddable: Can integrate with other languages like C/C++.
3. What is PEP 8 and why is it important?
PEP 8 is the Python Enhancement Proposal that provides style guidelines for writing Python code. It covers naming conventions, indentation, line length, imports, and more to ensure code consistency and readability.
It is important because it promotes a uniform coding style across projects, making code easier to read, maintain, and collaborate on. Following PEP 8 reduces errors and improves code quality in team environments.
Linters such as pylint or flake8 can check code against PEP 8, and formatters such as black rewrite code into a PEP 8-compatible style automatically.
4. How is Python executed?
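Python code is executed using an interpreter. When you run a Python script, the interpreter (CPython, the reference implementation) compiles the whole source file into bytecode and then executes that bytecode in the Python Virtual Machine (PVM). Bytecode for imported modules is cached in .pyc files under __pycache__; the script you run directly is not cached. Because compilation happens first, syntax errors are reported before any statement runs, while other errors surface only when the offending line executes.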
For example, running a script:
# example.py
print("Hello, World!")
Execute with: python example.py
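To make the compile-then-execute steps visible, here is a minimal sketch using the standard-library dis module. The variable names and the "example_source" label are illustrative only, and the exact bytecode listing differs between Python versions.

```python
import dis

source = 'print("Hello, World!")'

# compile() turns source text into a code object that holds bytecode.
code_obj = compile(source, "example_source", "exec")

# dis.dis() prints a human-readable listing of that bytecode.
dis.dis(code_obj)

# exec() runs the compiled code object, much as the interpreter runs a script.
exec(code_obj)
```

The listing shows the instructions the virtual machine executes; none of them run until exec() is called.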
5. What is a dynamically typed language?
A dynamically typed language is one where variable types are determined at runtime rather than at compile time. In Python, you don't need to specify the data type when declaring a variable; the interpreter infers it based on the assigned value.
This allows for flexibility but can lead to runtime errors if types are mismatched.
x = 10 # x is int
x = "Hello" # Now x is str, no error
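The runtime-error risk mentioned above can be sketched as follows. The function add_one is a hypothetical helper; the point is that the mismatch is only detected when the offending call executes.

```python
def add_one(value):
    # No type is declared; any object that supports "+ 1" is accepted.
    return value + 1

print(add_one(10))  # works for an int

try:
    add_one("10")  # a str cannot be added to an int
except TypeError as exc:
    type_error = exc
    print("TypeError:", exc)
```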
6. What are Built-in data types in Python?
Python's built-in data types include:
- Numeric: int, float, complex
- Sequence: list, tuple, range
- Text: str
- Binary: bytes, bytearray, memoryview
- Set: set, frozenset
- Mapping: dict
- Boolean: bool
- NoneType: None
These types are fundamental and don't require importing modules.
7. What is the difference between a Mutable datatype and an Immutable data type?
Mutable data types can be changed after creation, meaning their contents or values can be modified in place without creating a new object. Examples: list, dict, set.
Immutable data types cannot be changed after creation; any modification creates a new object. Examples: int, float, str, tuple.
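Mutability affects how objects behave when shared between functions and whether they can be used as dictionary keys: keys must be hashable, which in practice means immutable built-in types such as str, int, or a tuple of hashable items.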
lst = [1, 2] # Mutable
lst.append(3) # Changes lst
tup = (1, 2) # Immutable
# tup.append(3) # Error
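A short sketch of the dictionary-key rule, assuming nothing beyond built-in types (the names coords and error_message are illustrative): a tuple works as a key, while a list is rejected.

```python
# Tuples of hashable items can be dictionary keys.
coords = {(0, 0): "origin", (1, 2): "point"}
print(coords[(1, 2)])

# Lists are mutable and therefore unhashable, so they are rejected as keys.
try:
    bad = {[0, 0]: "origin"}
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)
```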
8. What are lists and tuples? What is the key difference between the two?
Lists are ordered, mutable collections of items, defined with square brackets []. They can hold mixed data types and allow duplicates.
Tuples are ordered, immutable collections, defined with parentheses (). They use slightly less memory than lists and are typically used for fixed data.
Key difference: Lists are mutable (can be modified), while tuples are immutable (cannot be changed after creation).
my_list = [1, 2, 3] # List
my_tuple = (1, 2, 3) # Tuple
9. What is a dictionary in Python?
A dictionary is a mutable collection of key-value pairs, defined with curly braces {}. Since Python 3.7, dictionaries preserve insertion order. Keys must be hashable (in practice, immutable) and unique, while values can be any type.
Dictionaries are efficient for lookups and are commonly used for mapping data.
my_dict = {"name": "Alice", "age": 30} # Dictionary
print(my_dict["name"]) # Output: Alice
10. What is a set in Python?
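A set is an unordered, mutable collection of unique elements, defined with curly braces {} around its elements or with the set() function. An empty set must be written set(), because {} creates an empty dictionary. Sets automatically remove duplicates and support mathematical operations such as union and intersection.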
They are useful for membership testing and eliminating duplicates.
my_set = {1, 2, 3, 2} # Set, duplicates removed
print(my_set) # Output: {1, 2, 3}
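The mathematical operations can be sketched with the set operators; the sets a and b are illustrative sample data.

```python
a = {1, 2, 3}
b = {3, 4, 5}

print(a | b)  # union: elements in either set
print(a & b)  # intersection: elements in both sets
print(a - b)  # difference: elements only in a
print(a ^ b)  # symmetric difference: elements in exactly one set

print(type({}))     # empty braces create a dict
print(type(set()))  # set() creates an empty set
```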
11. What is List Comprehension? Give an Example.
List comprehension is a concise way to create lists using a single line of code, often replacing for loops. It consists of an expression followed by a for clause and optional if clauses.
It improves readability and performance for simple iterations.
# Example: Squares of even numbers
squares = [x**2 for x in range(10) if x % 2 == 0]
print(squares) # Output: [0, 4, 16, 36, 64]
12. What is Dictionary Comprehension? Give an Example.
Dictionary comprehension is similar to list comprehension but creates dictionaries. It uses key:value expressions in a single line.
# Example: Squares with numbers as keys
squares_dict = {x: x**2 for x in range(5)}
print(squares_dict) # Output: {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
13. Is Tuple Comprehension possible in Python? If yes, how and if not why?
Tuple comprehension is not directly possible in Python using the same syntax as list comprehension because parentheses are used for generator expressions. However, you can create a tuple by passing a generator expression to tuple().
Reason: The syntax (expr for var in iter) creates a generator, not a tuple.
# Not tuple comprehension
gen = (x**2 for x in range(5)) # This is a generator
# To make a tuple
tup = tuple(x**2 for x in range(5))
print(tup) # Output: (0, 1, 4, 9, 16)
14. Differentiate between List and Tuple?
- Mutability: Lists are mutable; tuples are immutable.
- Syntax: Lists use []; tuples use ().
- Performance: Tuples are usually slightly smaller and quicker to create than equivalent lists.
- Use Cases: Lists for changeable data; tuples for fixed data or as dict keys (when every element is hashable).
- Methods: Lists have more methods (append, remove); tuples have fewer (count, index).
lst = [1, 2]; lst[0] = 0 # OK
tup = (1, 2); # tup[0] = 0 # Error
15. What is slicing in Python?
Slicing is a way to extract a subset of elements from sequence types like lists, tuples, or strings using the syntax [start:stop:step]. Start is inclusive, stop is exclusive.
It creates a new object without modifying the original.
my_list = [0, 1, 2, 3, 4]
subset = my_list[1:4] # [1, 2, 3]
every_other = my_list[::2] # [0, 2, 4]
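A few further slicing forms, sketched on illustrative data: negative indices count from the end, a negative step walks backwards, and the same syntax applies to strings.

```python
my_list = [0, 1, 2, 3, 4]
text = "python"

last_two = my_list[-2:]        # negative start counts from the end
reversed_list = my_list[::-1]  # negative step walks backwards
first_three = text[:3]         # slicing works on strings too

print(last_two, reversed_list, first_three)
print(my_list)  # the original list is unchanged
```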
16. What is __init__() in Python?
The __init__() method is a special (dunder) method that is often called a constructor, although strictly it is an initializer: __new__() creates the instance and __init__() is then invoked automatically to set up its attributes.
class Person:
    def __init__(self, name):
        self.name = name # Initialize attribute
p = Person("Alice")
print(p.name) # Output: Alice
17. What is a lambda function?
A lambda function is an anonymous, single-expression function defined using the lambda keyword. It is used for short, throwaway functions, often in higher-order functions like map() or filter().
add = lambda x, y: x + y
print(add(2, 3)) # Output: 5
# With map
numbers = [1, 2, 3]
squares = list(map(lambda x: x**2, numbers)) # [1, 4, 9]
18. What are *args and **kwargs?
*args allows a function to accept any number of positional arguments as a tuple.
**kwargs allows any number of keyword arguments as a dictionary.
They provide flexibility in function definitions.
def func(*args, **kwargs):
    print(args) # Tuple of positional args
    print(kwargs) # Dict of keyword args
func(1, 2, a=3, b=4) # (1, 2) {'a': 3, 'b': 4}
19. What is pass in Python?
The pass statement is a null operation; it does nothing when executed. It is used as a placeholder in functions, loops, or classes where code will be added later, to avoid syntax errors.
def func():
    pass # Placeholder, no action
class MyClass:
    pass # Empty class definition
20. What are break, continue, and pass in Python?
break: Terminates the nearest enclosing loop prematurely.
continue: Skips the rest of the current loop iteration and proceeds to the next.
pass: Does nothing; used as a placeholder.
for i in range(5):
    if i == 2:
        continue # Skip 2
    if i == 4:
        break # Stop at 4
    print(i) # Output: 0 1 3
# pass example
if True:
    pass # No action
Common pitfalls: Using break or continue outside a loop raises a SyntaxError.
21. How are arguments passed by value or by reference in Python?
Python uses a mechanism called "pass by object reference" or "call by sharing." Neither purely pass-by-value nor pass-by-reference, it passes references to objects. If the object is mutable (like lists), modifications inside the function affect the original. For immutable objects (like integers), changes create new objects without affecting the original.
# Mutable example
def modify_list(lst):
    lst.append(4) # Modifies original
my_list = [1, 2, 3]
modify_list(my_list)
print(my_list) # Output: [1, 2, 3, 4]
# Immutable example
def modify_int(x):
    x = x + 1 # Creates new object, original unchanged
num = 10
modify_int(num)
print(num) # Output: 10
Common pitfall: Assuming pass-by-value for mutables can lead to unexpected side effects.
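The distinction that matters in practice is rebinding a parameter versus mutating the object it refers to. A minimal sketch, using the hypothetical helpers rebind and mutate:

```python
def rebind(items):
    # Assignment binds the local name to a new list; the caller's list is untouched.
    items = items + [99]
    return items

def mutate(items):
    # In-place modification changes the object the caller also refers to.
    items.append(99)
    return items

original = [1, 2, 3]

rebind(original)
after_rebind = list(original)
print(after_rebind)

mutate(original)
after_mutate = list(original)
print(after_mutate)
```

Even with a mutable argument, plain assignment inside the function never affects the caller; only in-place operations do.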
22. What are decorators in Python?
Decorators are functions that modify the behavior of other functions or methods without changing their code. They wrap the target function, adding functionality before, after, or around it. Defined using @decorator_name syntax.
def decorator(func):
    def wrapper(*args, **kwargs):
        print("Before function")
        result = func(*args, **kwargs)
        print("After function")
        return result
    return wrapper
@decorator
def greet(name):
    print(f"Hello, {name}!")
greet("Alice") # Output: Before function\nHello, Alice!\nAfter function
Useful for logging, authentication, or timing functions.
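One detail worth sketching: a plain wrapper hides the decorated function's name and docstring. The standard-library functools.wraps copies them onto the wrapper. The decorator name log_calls is illustrative.

```python
import functools

def log_calls(func):
    @functools.wraps(func)  # copy __name__, __doc__, etc. onto the wrapper
    def wrapper(*args, **kwargs):
        print(f"Calling {func.__name__}")
        return func(*args, **kwargs)
    return wrapper

@log_calls
def greet(name):
    """Return a greeting."""
    return f"Hello, {name}!"

print(greet("Alice"))
print(greet.__name__)  # metadata of the original function is preserved
```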
23. What are generators in Python?
Generators are functions that yield values one at a time using the yield keyword, creating iterable sequences lazily. They save memory by not storing all values at once and maintain state between yields.
def count_up_to(n):
    i = 1
    while i <= n:
        yield i # Yield value, pause execution
        i += 1
for num in count_up_to(3):
    print(num) # Output: 1\n2\n3
Generators are memory-efficient for large datasets.
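The memory claim can be illustrated with a generator expression. This is a rough sketch: sys.getsizeof measures only the container object itself, and the exact sizes are CPython implementation details.

```python
import sys

squares_list = [n * n for n in range(100_000)]  # all values stored at once
squares_gen = (n * n for n in range(100_000))   # values produced on demand

# The generator object stays small no matter how many values it will yield.
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))

first = next(squares_gen)   # runs the generator up to its first value
second = next(squares_gen)  # resumes where it paused
print(first, second)

total_rest = sum(squares_gen)  # consumes the remaining values
exhausted = next(squares_gen, "exhausted")  # default returned once the generator is spent
print(exhausted)
```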
24. What are iterators in Python?
Iterators are objects that implement the iterator protocol with __iter__() and __next__() methods. They allow traversing collections like lists or custom objects. Calling next() retrieves the next item until StopIteration is raised.
my_list = [1, 2, 3]
iterator = iter(my_list) # Get iterator
print(next(iterator)) # 1
print(next(iterator)) # 2
print(next(iterator)) # 3
# next(iterator) # Raises StopIteration
Custom iterators can be created by defining __iter__ and __next__ in classes.
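A minimal custom iterator might look like the following sketch; the class name Countdown is hypothetical.

```python
class Countdown:
    """Iterate from start down to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # signals the end of iteration
        value = self.current
        self.current -= 1
        return value

print(list(Countdown(3)))
```

Because the object is its own iterator, it can be traversed only once; create a new Countdown to iterate again.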
25. What is the difference between xrange and range functions?
In Python 2, range() returns a list, while xrange() returns a lazy, memory-efficient sequence object that computes values on demand. In Python 3, range() behaves like Python 2's xrange() (it returns a range object), and xrange() has been removed.
# Python 3
r = range(5) # range(0, 5), a lazy sequence rather than a list
print(list(r)) # [0, 1, 2, 3, 4]
# In Python 2: range(5) would directly return [0,1,2,3,4]
Use range() in Python 3 for efficiency with large ranges.
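A short sketch of why a Python 3 range is cheap even when it is huge: it stores only start, stop, and step, yet still supports len(), indexing, and membership tests. The size bound below is a CPython detail, not a language guarantee.

```python
import sys

big = range(10**9)  # no billion-element list is built

print(len(big))                   # length computed from start/stop/step
print(big[-1])                    # indexing works without iterating
print(500 in big)                 # membership is computed arithmetically for ints
print(sys.getsizeof(big) < 1000)  # the object itself stays tiny

it = iter(big)  # a separate iterator is created on demand
first, second = next(it), next(it)
print(first, second)
```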
26. How is memory managed in Python?
Python uses automatic memory management with reference counting and garbage collection. Reference counting tracks object references; when count reaches zero, memory is freed. Cyclic references are handled by the garbage collector (gc module).
import gc
# Manual garbage collection
gc.collect() # Returns number of unreachable objects collected
In CPython, all objects live in a private heap managed by the interpreter; sys.getsizeof() reports the size of an individual object in bytes.
27. What is a namespace in Python?
A namespace is a mapping from names to objects, like a dictionary for variable scopes. Types: built-in (e.g., print), global (module-level), local (function-level), and enclosing (nested functions).
x = 10 # Global namespace
def outer():
    y = 20 # Enclosing for inner
    def inner():
        z = 30 # Local
        print(x, y, z) # Accesses global, enclosing, local
    inner()
outer() # Output: 10 20 30
Namespaces prevent name conflicts and manage scope.
28. What are modules and packages in Python?
A module is a single .py file containing functions, classes, and variables. A package is a directory of modules, usually with an __init__.py file (since Python 3.3, namespace packages can omit it), allowing hierarchical organization.
# Importing module
import math
print(math.pi) # 3.141592653589793
# Package example: mypackage/module.py
# from mypackage import module
Packages enable large-scale code organization.
29. What is PIP?
PIP (Pip Installs Packages) is Python's package installer, used to install, upgrade, and manage libraries from PyPI (Python Package Index).
# Install a package (run in terminal)
# pip install requests
# Upgrade pip
# pip install --upgrade pip
Common commands: pip list, pip uninstall, pip freeze > requirements.txt.
30. What is pickling and unpickling?
Pickling serializes Python objects to byte streams (using pickle module) for storage or transmission. Unpickling deserializes them back.
import pickle
data = {'a': 1, 'b': 2}
# Pickling
with open('data.pkl', 'wb') as f:
    pickle.dump(data, f)
# Unpickling
with open('data.pkl', 'rb') as f:
    loaded = pickle.load(f)
print(loaded) # {'a': 1, 'b': 2}
Security warning: Avoid unpickling untrusted data.
31. What is the difference between a shallow copy and a deep copy?
Shallow copy creates a new object but references the same nested objects (using copy.copy()). Deep copy creates a new object and recursively copies all nested objects (copy.deepcopy()).
import copy
lst = [[1, 2], 3]
shallow = copy.copy(lst)
shallow[0][0] = 0 # Affects original nested list
print(lst) # [[0, 2], 3]
deep = copy.deepcopy(lst)
deep[0][0] = 5 # Doesn't affect original
print(lst) # [[0, 2], 3]
Use deep copy for nested mutables.
32. How do you debug a Python program?
Debug using print statements, the pdb module (set breakpoints with pdb.set_trace() or, since Python 3.7, the built-in breakpoint()), IDE debuggers (VS Code, PyCharm), or logging. Tools like ipdb enhance pdb.
import pdb
def func(x):
    pdb.set_trace() # Breakpoint
    return x * 2
func(5) # Enter debugger: (Pdb) commands like n (next), c (continue)
Common commands: l (list), p (print variable).
33. What is Polymorphism in Python?
Polymorphism allows objects of different classes to be treated as the same type, responding to the same method calls differently (method overriding or operator overloading).
class Dog:
    def sound(self):
        return "Woof"
class Cat:
    def sound(self):
        return "Meow"
def make_sound(animal):
    print(animal.sound())
make_sound(Dog()) # Woof
make_sound(Cat()) # Meow
Enables flexible, interchangeable code.
34. Define encapsulation in Python?
Encapsulation bundles data and methods within a class and hides implementation details. Python has no enforced access modifiers; it relies on naming conventions instead: a single leading underscore (_) marks an attribute as internal ("protected"), and a double leading underscore (__) triggers name mangling ("private").
class Person:
    def __init__(self, name):
        self.__name = name # Private attribute
    def get_name(self):
        return self.__name # Getter
p = Person("Alice")
# print(p.__name) # AttributeError
print(p.get_name()) # Alice
Promotes data integrity and modularity.
35. How do you do data abstraction in Python?
Data abstraction hides complex implementation details, exposing only essential features via abstract classes (abc module) or interfaces.
from abc import ABC, abstractmethod
class Shape(ABC):
    @abstractmethod
    def area(self):
        pass
class Circle(Shape):
    def __init__(self, r):
        self.r = r
    def area(self):
        return 3.14 * self.r ** 2
c = Circle(5)
print(c.area()) # 78.5
Ensures subclasses implement required methods.
36. Does Python support multiple inheritance?
Yes, Python supports multiple inheritance, where a class can inherit from multiple base classes. Method Resolution Order (MRO) resolves conflicts using C3 linearization.
class A:
    def method(self):
        print("A")
class B:
    def method(self):
        print("B")
class C(A, B):
    pass
c = C()
c.method() # A (follows MRO: C -> A -> B)
print(C.__mro__) # (<class '__main__.C'>, <class '__main__.A'>, <class '__main__.B'>, <class 'object'>)
Diamond problem handled by MRO.
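To see how the MRO resolves a diamond, here is a sketch with hypothetical classes Base, Left, Right, and Child. Each hello() calls super(), so every class in the MRO runs exactly once, in order.

```python
class Base:
    def hello(self):
        return ["Base"]

class Left(Base):
    def hello(self):
        return ["Left"] + super().hello()

class Right(Base):
    def hello(self):
        return ["Right"] + super().hello()

class Child(Left, Right):
    def hello(self):
        return ["Child"] + super().hello()

order = Child().hello()
print(order)
print([cls.__name__ for cls in Child.__mro__])
```

Note that super() in Left dispatches to Right, not Base, because it follows the MRO of the instance's class rather than Left's own parent.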
37. How is exception handling done in Python?
Exception handling manages runtime errors using try-except-else-finally blocks. The try block runs code, except catches specific exceptions, else runs only if no exception was raised, and finally always executes.
try:
    x = 1 / 0
except ZeroDivisionError:
    print("Division by zero!") # Handles error
else:
    print("No error")
finally:
    print("Always runs") # Cleanup
# Output: Division by zero!\nAlways runs
Custom exceptions via subclassing Exception.
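A custom exception can be sketched as follows. The names InsufficientFundsError and withdraw are hypothetical and only serve to illustrate the subclassing pattern.

```python
class InsufficientFundsError(Exception):
    """Raised when a withdrawal exceeds the available balance."""

    def __init__(self, balance, amount):
        super().__init__(f"Cannot withdraw {amount}; balance is {balance}")
        self.balance = balance
        self.amount = amount

def withdraw(balance, amount):
    if amount > balance:
        raise InsufficientFundsError(balance, amount)
    return balance - amount

try:
    withdraw(50, 80)
except InsufficientFundsError as exc:
    caught = exc  # keep a reference; "exc" is cleared when the block ends
    print(exc)
```

Storing extra attributes on the exception lets handlers react to the details without parsing the message.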
38. What are unit tests in Python?
Unit tests are small, isolated tests for individual code units (functions, methods) using unittest module. They ensure code works as expected.
import unittest
def add(a, b):
    return a + b
class TestAdd(unittest.TestCase):
    def test_add(self):
        self.assertEqual(add(2, 3), 5)
if __name__ == '__main__':
    unittest.main() # Run tests
Methods like assertEqual, assertRaises for assertions.
39. What is Python Global Interpreter Lock (GIL)?
GIL is a mutex that protects access to Python objects, allowing only one thread to execute Python bytecode at a time in CPython. It simplifies memory management but limits multi-threading for CPU-bound tasks.
Workarounds: Multiprocessing, or libraries like NumPy that release GIL.
import threading
def task():
    for _ in range(10**7):
        pass
threads = [threading.Thread(target=task) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join() # GIL makes this sequential for CPU-bound
PEP 703 makes the GIL optional: CPython 3.13 added an experimental free-threaded build that runs without it.
40. What are Function Annotations in Python?
Function annotations add metadata to parameters and return values (PEP 3107). They are optional, stored in __annotations__, often used for type hints.
def greet(name: str) -> str:
    return f"Hello, {name}"
print(greet.__annotations__) # {'name': <class 'str'>, 'return': <class 'str'>}
Tools like mypy use them for static type checking.
41. What is Walrus Operator?
The Walrus Operator, introduced in Python 3.8 (PEP 572), is :=, which assigns a value to a variable as part of an expression. It allows assignment within conditions or loops, reducing code duplication.
my_list = list(range(12)) # Sample data so the snippet runs
# Without walrus
n = len(my_list)
if n > 10:
    print(n)
# With walrus
if (n := len(my_list)) > 10:
    print(n) # Outputs length if > 10
Use sparingly to maintain readability.
42. What is Python Switch Statement?
Python does not have a traditional switch statement like C/C++. Instead, use if-elif-else chains or dictionaries for mapping. From Python 3.10 (PEP 634), structural pattern matching with match-case provides similar functionality.
# Using if-elif
def get_day(num):
    if num == 1:
        return "Monday"
    elif num == 2:
        return "Tuesday"
    else:
        return "Invalid"
# Using match (Python 3.10+)
def get_day(num):
    match num:
        case 1:
            return "Monday"
        case 2:
            return "Tuesday"
        case _:
            return "Invalid"
Dictionaries for simple cases: days = {1: "Monday", 2: "Tuesday"}
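The dictionary approach can be sketched like this, with dict.get() supplying the default branch. DAYS and get_day are illustrative names.

```python
DAYS = {1: "Monday", 2: "Tuesday"}

def get_day(num):
    # .get() returns the fallback when the key is missing, like a default case.
    return DAYS.get(num, "Invalid")

print(get_day(1))
print(get_day(9))
```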
43. What are Exception Groups in Python?
Exception Groups, introduced in Python 3.11 (PEP 654), allow raising and handling multiple unrelated exceptions together using ExceptionGroup and BaseExceptionGroup. Useful in concurrent programming.
try:
    raise ExceptionGroup("Errors", [ValueError("val"), TypeError("type")])
except* ValueError as e:
    print("Handled ValueError:", e.exceptions)
except* TypeError as e:
    print("Handled TypeError:", e.exceptions)
except* handles subgroups of exceptions.
44. What is docstring in Python?
A docstring is a string literal specified in source code that documents modules, classes, functions, or methods. It is accessed via __doc__ attribute and used by tools like help().
def add(a, b):
    """Adds two numbers.

    Args:
        a (int): First number
        b (int): Second number

    Returns:
        int: Sum
    """
    return a + b
print(add.__doc__) # Outputs the docstring
Styles: Google, NumPy, reStructuredText.
45. What is the difference between Python Arrays and Lists?
Lists are built-in, dynamic, heterogeneous collections. Arrays (from array module) are homogeneous, fixed-type, more memory-efficient for primitives.
import array
my_list = [1, "a", 3.5] # Heterogeneous list
my_array = array.array('i', [1, 2, 3]) # Homogeneous int array
# my_array.append("a") # TypeError
Use arrays for compact storage of large numeric data; lists for flexibility.
46. What is the use of self in Python?
self refers to the instance of the class in methods. It is the first parameter in instance methods, allowing access to attributes and methods.
class Dog:
    def __init__(self, name):
        self.name = name # Instance attribute
    def bark(self):
        print(f"{self.name} says Woof!") # Access via self
d = Dog("Buddy")
d.bark() # Buddy says Woof!
self is only a convention; any name would work, but always use self so other readers recognize it.
47. What are global, protected and private attributes in Python?
Global: Module-level variables. Protected: Prefixed with _ (convention for internal use). Private: Prefixed with __ (name mangling to _Class__attr).
global_var = 10 # Global
class MyClass:
    def __init__(self):
        self._protected = 20 # Protected
        self.__private = 30 # Private (mangled to _MyClass__private)
No strict enforcement; conventions for encapsulation.
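The lack of strict enforcement can be demonstrated with a short sketch; the class Account and its attribute names are hypothetical.

```python
class Account:
    def __init__(self):
        self._internal = 20  # single underscore: convention only
        self.__secret = 30   # double underscore: stored as _Account__secret

acct = Account()

internal_value = acct._internal            # accessible, just discouraged
has_plain_name = hasattr(acct, "__secret") # the unmangled name does not exist
mangled_value = acct._Account__secret      # the mangled name is still reachable

print(internal_value, has_plain_name, mangled_value)
```

Name mangling mainly prevents accidental clashes in subclasses; it is not a security boundary.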
48. What is Scope in Python?
Scope defines variable accessibility: Local (function), Enclosing (nested functions), Global (module), Built-in (names Python predefines, such as len and print).
x = 10 # Global
def outer():
    y = 20 # Enclosing
    def inner():
        z = 30 # Local
        print(x, y, z) # LEGB rule
    inner()
outer() # 10 20 30
LEGB rule: Local -> Enclosing -> Global -> Built-in.
49. Explain how you can make a Python script executable on Unix.
Add the shebang line #!/usr/bin/env python3 at the top of the file, make the file executable with chmod +x script.py, and run it as ./script.py.
#!/usr/bin/env python3
# script.py
print("Hello")
Terminal: chmod +x script.py; ./script.py
Ensure the python3 interpreter is on your PATH.
50. What is the use of help() and dir() functions?
help(): Displays documentation (docstrings). dir(): Lists attributes/methods of an object.
import math
help(math.sin) # Shows sin doc
dir(math) # Lists math module attributes
Useful for exploration/interactive sessions.
51. What are metaclasses in Python?
Metaclasses are classes of classes, defining class behavior. type is default metaclass. Custom metaclasses override class creation.
class Meta(type):
    def __new__(cls, name, bases, dct):
        dct['added_attr'] = 100
        return super().__new__(cls, name, bases, dct)
class MyClass(metaclass=Meta):
    pass
print(MyClass.added_attr) # 100
For frameworks, advanced customization.
52. What is monkey patching in Python?
Monkey patching dynamically modifies modules/classes at runtime, often for testing or bug fixes.
import math
def fake_sin(x):
    return 0
math.sin = fake_sin # Patch sin
print(math.sin(3.14)) # 0
Use cautiously; can break code.
53. What is the difference between @classmethod, @staticmethod and instance methods in Python?
Instance methods: Take self, access instance. @classmethod: Take cls, access class, for factories. @staticmethod: No self/cls, utility functions.
class MyClass:
    def instance_method(self):
        return self # Instance access
    @classmethod
    def class_method(cls):
        return cls # Class access
    @staticmethod
    def static_method():
        return "Static" # No access
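A slightly more practical sketch shows the three kinds of method working together. The Temperature class and its method names are hypothetical.

```python
class Temperature:
    def __init__(self, celsius):
        self.celsius = celsius

    def describe(self):
        # Instance method: works with this object's own data.
        return f"{self.celsius:.1f} C"

    @classmethod
    def from_fahrenheit(cls, fahrenheit):
        # Class method: an alternative constructor (factory).
        return cls(cls.f_to_c(fahrenheit))

    @staticmethod
    def f_to_c(fahrenheit):
        # Static method: a utility that needs neither self nor cls.
        return (fahrenheit - 32) * 5 / 9

t = Temperature.from_fahrenheit(212)
print(t.describe())
```

Using cls(...) rather than Temperature(...) in the factory means subclasses get instances of their own type.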
54. How does Python handle memory management, and what role does garbage collection play?
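Python uses reference counting: an object's memory is freed as soon as its reference count drops to zero. The garbage collector (gc module) additionally detects and frees cyclic references, which reference counting alone cannot reclaim, using generational collection.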
import gc
gc.collect() # Manual collection
Private heap; automatic deallocation.
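The role of the cycle collector can be sketched with a weak reference, which lets us observe whether an object is still alive without keeping it alive. This assumes CPython; the Node class is hypothetical.

```python
import gc
import weakref

class Node:
    def __init__(self):
        self.other = None

a, b = Node(), Node()
a.other, b.other = b, a  # the two objects reference each other (a cycle)
probe = weakref.ref(a)   # a weak reference does not keep "a" alive

del a, b      # reference counts stay above zero because of the cycle
gc.collect()  # the cycle collector finds and frees the unreachable pair

print(probe() is None)  # the weak reference is now dead
```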
55. What is the difference between shallow copy and deep copy in Python, and when would you use each?
Shallow: Copies top-level, shares nested (copy.copy()). Deep: Recursively copies all (copy.deepcopy()). Use shallow for simple structures; deep for nested mutables to avoid side effects.
import copy
lst = [[1]]
shallow = copy.copy(lst)
shallow[0][0] = 2 # Affects original
deep = copy.deepcopy(lst)
deep[0][0] = 3 # Independent
56. What are context managers in Python, and how are they implemented?
Context managers handle resource setup/teardown (e.g., files) using with. Implement with __enter__ and __exit__ methods.
class MyContext:
    def __enter__(self):
        print("Enter")
        return self
    def __exit__(self, exc_type, exc_val, exc_tb):
        print("Exit")
with MyContext():
    print("Inside") # Enter\nInside\nExit
Or use @contextmanager decorator.
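The decorator-based form lives in the standard-library contextlib module. A minimal sketch, where managed and events are illustrative names: code before yield plays the role of __enter__, and the finally block plays the role of __exit__.

```python
from contextlib import contextmanager

events = []

@contextmanager
def managed(name):
    events.append(f"enter {name}")     # setup, like __enter__
    try:
        yield name                     # value bound by "as"
    finally:
        events.append(f"exit {name}")  # teardown, runs even on error

with managed("resource") as r:
    events.append(f"using {r}")

print(events)
```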
57. What is NumPy, and why is it used in Python?
NumPy is a library for numerical computing, providing n-dimensional arrays, mathematical functions, and tools for scientific computation. Used for efficiency in data manipulation, analysis, and machine learning.
Advantages: Vectorized operations, broadcasting, integration with other libraries.
58. How do you create a NumPy array from a Python list?
Use np.array() to convert a list to a NumPy ndarray.
import numpy as np
my_list = [1, 2, 3]
arr = np.array(my_list) # 1D array
print(arr) # [1 2 3]
# 2D
arr2d = np.array([[1, 2], [3, 4]])
Specify dtype if needed: np.array(my_list, dtype=float).
59. What are the advantages of NumPy over regular Python lists?
- Efficiency: Homogeneous, contiguous memory.
- Vectorization: Faster operations without loops.
- Broadcasting: Operates on different shapes.
- Mathematical functions: Built-in like sum, mean.
- Multidimensional support.
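import timeit
import numpy as np
arr = np.arange(1000000)
lst = list(range(1000000))
print(timeit.timeit(arr.sum, number=10)) # NumPy vectorized sum
print(timeit.timeit(lambda: sum(lst), number=10)) # Pure-Python sum, typically much slower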
60. How do you find the shape of any given NumPy array?
Use the shape attribute, returning a tuple of dimensions.
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr.shape) # (2, 3)
For reshaping: arr.reshape((3, 2)).
61. What is broadcasting in NumPy?
Broadcasting is a mechanism that allows NumPy to perform element-wise operations on arrays of different shapes by automatically expanding the smaller array to match the larger one's dimensions without copying data. It follows rules: dimensions must be compatible (equal or one is 1), and trailing dimensions are aligned.
import numpy as np
a = np.array([1, 2, 3]) # Shape (3,)
b = np.array([[10], [20], [30]]) # Shape (3,1)
result = a + b # Broadcasts a to (3,3), b to (3,3)
print(result)
# Output:
# [[11 12 13]
# [21 22 23]
# [31 32 33]]
Pitfall: Incompatible shapes raise ValueError.
62. What do you know about pandas?
Pandas is an open-source Python library for data manipulation and analysis, providing data structures like Series (1D) and DataFrame (2D); the older 3D Panel was removed in pandas 0.25. It excels in handling structured data and time series, and integrates with NumPy, Matplotlib, and scikit-learn. Key features: data cleaning, merging, grouping, and I/O for CSV, Excel, SQL.
import pandas as pd
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
print(df)
# A B
# 0 1 3
# 1 2 4
63. Define pandas dataframe.
A Pandas DataFrame is a 2D labeled data structure with columns of potentially different types, like a spreadsheet or SQL table. It has rows (index) and columns, supports operations like filtering, grouping, and pivoting.
import pandas as pd
data = {'Name': ['Alice', 'Bob'], 'Age': [25, 30]}
df = pd.DataFrame(data, index=['row1', 'row2'])
print(df)
# Name Age
# row1 Alice 25
# row2 Bob 30
64. How will you combine different pandas dataframes?
Combine DataFrames using pd.concat() for stacking (axis=0 rows, axis=1 columns) or pd.merge() for SQL-like joins. The older df.append() was deprecated and then removed in pandas 2.0; use pd.concat() instead.
import pandas as pd
df1 = pd.DataFrame({'A': [1, 2]})
df2 = pd.DataFrame({'A': [3, 4]})
concat_rows = pd.concat([df1, df2]) # Stacks rows
df3 = pd.DataFrame({'B': [5, 6]})
concat_cols = pd.concat([df1, df3], axis=1) # Adds columns
merged = pd.merge(df1, df3, left_index=True, right_index=True) # Join on index
65. How will you identify and deal with missing values in a dataframe?
Identify with df.isnull() or df.isna(), count with df.isnull().sum(). Deal with: drop via df.dropna(), fill with df.fillna(value), or interpolate df.interpolate().
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': [1, np.nan, 3]})
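print(df.isnull().sum()) # A    1 (one missing value in column A)
filled = df.fillna(0) # New DataFrame with NaN replaced by 0
dropped = df.dropna() # New DataFrame without the rows containing NaN
Choose the method based on the data context; for example, numerical columns are often filled with the mean or median.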
66. What do you understand by reindexing in pandas?
Reindexing changes the index labels of a DataFrame or Series, conforming to a new index with optional filling. Use df.reindex(new_index) to reorder or add/remove indices.
import pandas as pd
df = pd.DataFrame({'A': [1, 2]}, index=['a', 'b'])
reindexed = df.reindex(['b', 'c'], fill_value=0)
print(reindexed)
# A
# b 2
# c 0
67. How to add new column to pandas dataframe?
Add via df['new_col'] = values, which modifies the DataFrame in place, or df.assign(new_col=values), which returns a new DataFrame and leaves the original unchanged.
import pandas as pd
df = pd.DataFrame({'A': [1, 2]})
df['B'] = [3, 4] # Adds column B
df = df.assign(C=lambda x: x['A'] + x['B']) # Derived column
68. How will you delete indices, rows and columns from a dataframe?
Delete rows: df.drop(labels, axis=0). Columns: df.drop(labels, axis=1). Delete (discard) the index: df.reset_index(drop=True), which replaces it with the default integer index. To replace the index with an existing column instead, use df.set_index(col).
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3]}, index=['x', 'y', 'z'])
df.drop('y', axis=0, inplace=True) # Delete row 'y'
df.drop('A', axis=1, inplace=True) # Delete column 'A'
df.reset_index(drop=True, inplace=True) # Reset index
69. Can you get items of series A that are not available in another series B?
Yes, use ~A.isin(B) to filter A.
import pandas as pd
A = pd.Series([1, 2, 3, 4])
B = pd.Series([3, 4, 5])
unique_A = A[~A.isin(B)] # [1, 2]
70. How will you get the items that are not common to both the given series A and B?
Use symmetric difference with pd.concat([A[~A.isin(B)], B[~B.isin(A)]]) or set operations.
import pandas as pd
A = pd.Series([1, 2, 3])
B = pd.Series([3, 4, 5])
not_common = pd.concat([A[~A.isin(B)], B[~B.isin(A)]]).sort_values()
# 1,2,4,5
71. While importing data from different sources, can the pandas library recognize dates?
Yes. Pass parse_dates with a list of column names to pd.read_csv() (parse_dates=True attempts to parse the index instead), or convert a column after import with pd.to_datetime().
import pandas as pd
# Assuming CSV with 'date' column
df = pd.read_csv('file.csv', parse_dates=['date'])
# Or
df['date'] = pd.to_datetime(df['date'])
72. What is the difference between merge, join, and concatenate?
Merge: SQL-like join on columns/keys (pd.merge). Join: Index-based merge (df.join). Concatenate: Stack along axis (pd.concat).
import pandas as pd
df1 = pd.DataFrame({'key': [1], 'A': ['a']})
df2 = pd.DataFrame({'key': [1], 'B': ['b']})
merged = pd.merge(df1, df2, on='key') # Join on key
df3 = pd.DataFrame({'C': ['c']}, index=[0])
joined = df1.join(df3) # Index join
concat = pd.concat([df1, df2], axis=1) # Concat columns
73. How do you group data in a DataFrame and apply an aggregation function?
Use df.groupby(columns).agg(func) or specific like .sum(), .mean().
import pandas as pd
df = pd.DataFrame({'Group': ['A', 'A', 'B'], 'Value': [1, 2, 3]})
grouped = df.groupby('Group').agg({'Value': 'sum'})
# Value
# Group
# A 3
# B 3
74. How do you handle missing data in a dataset before training a model?
Identify: isnull(). Strategies: Drop (dropna()), Impute (fillna() with mean/median/mode, or sklearn.impute.SimpleImputer), Forward/backward fill (ffill/bfill).
from sklearn.impute import SimpleImputer
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': [1, np.nan, 3]})
imputer = SimpleImputer(strategy='mean')
df['A'] = imputer.fit_transform(df[['A']]).ravel() # fit_transform returns a 2D array; flatten it for the column
Choice depends on missingness mechanism (MCAR, MAR, MNAR).
75. What is the difference between fit() and transform() in Scikit-Learn?
fit(): Learns parameters from data (e.g., mean in scaler). transform(): Applies learned transformation. fit_transform(): Both in one.
from sklearn.preprocessing import StandardScaler
import numpy as np
data = np.array([[1, 2], [3, 4]])
scaler = StandardScaler()
scaler.fit(data) # Computes mean, std
transformed = scaler.transform(data) # Scales
76. How do you perform cross-validation in Python?
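Use sklearn.model_selection.cross_val_score or cross_validate, optionally with an explicit splitter such as KFold. The data is split into k folds; the model is trained on k-1 folds and evaluated on the remaining one, k times.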
from sklearn.model_selection import cross_validate
from sklearn.linear_model import LinearRegression
import numpy as np
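X = np.arange(10).reshape(-1, 1) # 10 samples, 1 feature
y = 2 * X.ravel() + 1
model = LinearRegression()
scores = cross_validate(model, X, y, cv=5) # 5-fold CV; the default R^2 score needs at least 2 samples per test fold
print(scores['test_score']) # One score per fold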
77. What are the main steps in preprocessing text data for NLP tasks?
- Tokenization: Split into words/tokens.
- Lowercasing: Convert to lowercase.
- Remove punctuation/stopwords.
- Stemming/Lemmatization: Reduce to root forms.
- Vectorization: TF-IDF, Word2Vec.
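import string
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
# One-time downloads of the tokenizer data and stopword list (newer NLTK releases may also need 'punkt_tab')
nltk.download('punkt')
nltk.download('stopwords')
text = "Hello, world!"
tokens = word_tokenize(text.lower())
stop_words = set(stopwords.words('english')) # Build the set once instead of once per token
filtered = [w for w in tokens if w not in stop_words and w not in string.punctuation]
# ['hello', 'world']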
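To show how the steps fit together without any downloads, here is a standard-library sketch. It swaps in a regular expression for the NLTK tokenizer, a tiny hand-written stopword set, and plain word counts in place of TF-IDF; all of these are simplifying assumptions rather than a recommended pipeline:

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not NLTK's list).
STOPWORDS = {"the", "a", "is", "and", "on"}

def preprocess(text):
    # Lowercase, then keep runs of letters/digits/apostrophes; punctuation is dropped.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

docs = ["The cat sat on the mat.", "A dog is on the mat!"]
bags = [Counter(preprocess(d)) for d in docs]  # bag-of-words counts per document

print(preprocess(docs[0]))  # ['cat', 'sat', 'mat']
print(preprocess(docs[1]))  # ['dog', 'mat']
```

Stemming or lemmatization would slot in between the stopword filter and the counting step.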
78. How do you create a scatter plot using Matplotlib?
Use plt.scatter(x, y), customize with color, size, etc.
import matplotlib.pyplot as plt
x = [1, 2, 3]
y = [4, 5, 6]
plt.scatter(x, y, color='red', marker='o')
plt.title('Scatter Plot')
plt.show()
79. How do you plot multiple subplots in Matplotlib?
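Use plt.subplots(nrows, ncols) to create a grid of axes, then plot on each one, e.g. axs[i, j].plot(). When the grid has a single row or column, axs is one-dimensional and is indexed as axs[i].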
import matplotlib.pyplot as plt
import numpy as np
fig, axs = plt.subplots(2, 1)
axs[0].plot(np.arange(10))
axs[1].scatter(np.random.rand(10), np.random.rand(10))
plt.show()
80. What is the multiprocessing module and how is it different from threading?
Multiprocessing creates separate processes for parallelism, bypassing GIL for CPU-bound tasks. Threading uses threads within one process, GIL-limited, better for I/O-bound.
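from multiprocessing import Process
def task():
    print("Process")
if __name__ == "__main__": # Guard needed where child processes are spawned (e.g. Windows, macOS)
    p = Process(target=task)
    p.start()
    p.join()
# threading.Thread offers the same start()/join() interface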
Multiprocessing has overhead from process creation/communication.
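For comparison, a threading counterpart might look like the sketch below. The function io_task and the use of time.sleep as a stand-in for blocking I/O are assumptions made for the example:

```python
import threading
import time

results = []

def io_task(name):
    # time.sleep stands in for a blocking I/O call; the GIL is released while waiting.
    time.sleep(0.2)
    results.append(name)

start = time.perf_counter()
threads = [threading.Thread(target=io_task, args=(f"t{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

print(sorted(results))  # ['t0', 't1', 't2']
print(f"elapsed: {elapsed:.2f}s")
```

Because the three waits overlap, the total time is typically close to one sleep rather than three. A CPU-bound task would not see the same benefit from threads.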
81. How do you use the asyncio module?
The asyncio module provides support for writing single-threaded concurrent code using coroutines, multiplexing I/O access over sockets and other resources, and implementing network clients and servers.
To use it, define coroutines with async def, use await for asynchronous calls, and run the event loop with asyncio.run().
import asyncio
async def fetch_data():
    print("Fetching...")
    await asyncio.sleep(1) # Simulate async I/O
    return "Data"
async def main():
    data = await fetch_data()
    print(data)
asyncio.run(main()) # Run the event loop
Common pitfall: calling a coroutine without await does not run it; Python only emits a RuntimeWarning that the coroutine was never awaited.
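Awaiting coroutines one after another runs them sequentially. To overlap several waits, asyncio.gather can be used, as in this minimal sketch (the fetch coroutine and its delays are hypothetical):

```python
import asyncio

async def fetch(name, delay):
    await asyncio.sleep(delay)  # stands in for a network call
    return f"{name} done"

async def main():
    # gather runs both coroutines concurrently and returns results in argument order.
    return await asyncio.gather(fetch("a", 0.2), fetch("b", 0.1))

results = asyncio.run(main())
print(results)  # ['a done', 'b done']
```

The results follow the order of the arguments, not the order in which the coroutines finish.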
82. What is a coroutine in Python?
A coroutine is a special function defined with async def that can be paused and resumed, allowing concurrent execution without threads. It uses await to suspend execution until an awaitable (like another coroutine or Future) completes.
import asyncio
async def my_coro():
    print("Start")
    await asyncio.sleep(1)
    print("End")
asyncio.run(my_coro())
Coroutines are building blocks for asyncio-based concurrency.
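One detail that often surprises newcomers is that calling a coroutine function only creates a coroutine object; nothing executes until an event loop drives it. A small sketch (compute is a made-up example function):

```python
import asyncio
import inspect

async def compute():
    return 42

coro = compute()                     # nothing has run yet
is_coro = inspect.iscoroutine(coro)  # this is a coroutine object, not a result
value = asyncio.run(coro)            # the event loop drives it to completion

print(is_coro, value)  # True 42
```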
83. What are Python descriptors?
Descriptors are objects that define how attribute access is handled via __get__, __set__, __delete__ methods. They power properties, methods, and classmethods.
class Descriptor:
    def __get__(self, instance, owner):
        return "Value"
class MyClass:
    attr = Descriptor()
obj = MyClass()
print(obj.attr) # Value
Useful for computed attributes or validation.
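As an illustration of the validation use case, the sketch below defines a data descriptor with __set__ and __set_name__. The class names Positive and Product are hypothetical:

```python
class Positive:
    """Data descriptor that rejects non-positive values (illustrative)."""

    def __set_name__(self, owner, name):
        # Called once at class creation; remember where to store the value.
        self.storage_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class itself
        return getattr(instance, self.storage_name)

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError("value must be positive")
        setattr(instance, self.storage_name, value)


class Product:
    price = Positive()

    def __init__(self, price):
        self.price = price  # routed through Positive.__set__


item = Product(10)
print(item.price)  # 10
try:
    item.price = -5
except ValueError as exc:
    print(exc)  # value must be positive
```

Storing the value under a per-instance name keeps instances independent, even though the descriptor object itself is shared by the class.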
84. How do you create a custom exception in Python?
Create by subclassing Exception or a subclass. Add custom attributes or methods if needed.
class CustomError(Exception):
    def __init__(self, message, code):
        super().__init__(message)
        self.code = code
try:
    raise CustomError("Error occurred", 404)
except CustomError as e:
    print(e.code) # 404
Subclass the most specific built-in exception that fits (for example ValueError or LookupError) so callers can also catch it through that built-in type.
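A short sketch of deriving from a more specific built-in follows. InvalidAgeError and parse_age are hypothetical names chosen for the example:

```python
class InvalidAgeError(ValueError):
    """Raised when an age is negative (illustrative)."""

    def __init__(self, age):
        super().__init__(f"invalid age: {age}")
        self.age = age

def parse_age(text):
    age = int(text)
    if age < 0:
        raise InvalidAgeError(age)
    return age

try:
    parse_age("-3")
except ValueError as exc:  # the custom error is still a ValueError
    caught = exc

print(type(caught).__name__, caught.age)  # InvalidAgeError -3
```

Code that already handles ValueError keeps working, while newer code can catch InvalidAgeError specifically.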
85. What is method resolution order (MRO) in Python?
MRO determines the order in which base classes are searched for a method in multiple inheritance. Python uses C3 linearization, accessible via __mro__ or mro().
class A: pass
class B(A): pass
class C(A): pass
class D(B, C): pass
print(D.__mro__) # (<class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.A'>, <class 'object'>)
C3 linearization resolves the diamond problem consistently: each class appears once in the MRO, so A is searched only after both B and C.
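The MRO also decides where super() goes next. In this sketch, which extends the same A/B/C/D diamond with a hypothetical who() method, each class delegates to the next one in the MRO:

```python
class A:
    def who(self):
        return ["A"]

class B(A):
    def who(self):
        return ["B"] + super().who()

class C(A):
    def who(self):
        return ["C"] + super().who()

class D(B, C):
    def who(self):
        return ["D"] + super().who()

print(D().who())                          # ['D', 'B', 'C', 'A']
print([cls.__name__ for cls in D.mro()])  # ['D', 'B', 'C', 'A', 'object']
```

Note that super() inside B calls C.who(), not A.who(), when the instance is a D. The next class is taken from the instance's MRO rather than from B's own bases.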
86. How do you manage package dependencies in Python?
Use pip for installation, requirements.txt for listing (pip freeze > requirements.txt), virtual environments for isolation. Tools like poetry or pipenv handle dependencies declaratively.
# Install from requirements.txt
# pip install -r requirements.txt
# With poetry (in pyproject.toml)
# poetry add requests
Check conflicts with pip check.
87. What is the collections module and what does it provide?
The collections module provides specialized container datatypes: namedtuple, deque, Counter, OrderedDict, defaultdict.
from collections import Counter, defaultdict
c = Counter("hello")
print(c) # Counter({'l': 2, 'h': 1, 'e': 1, 'o': 1})
d = defaultdict(list)
d['key'].append(1) # No KeyError
Enhances built-in dict, list, tuple.
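The remaining containers named above can be sketched briefly as well. The variable names and values are illustrative only:

```python
from collections import namedtuple, deque, OrderedDict

# namedtuple: a tuple whose fields can be accessed by name
Point = namedtuple("Point", ["x", "y"])
p = Point(1, 2)
print(p.x, p.y)  # 1 2

# deque with maxlen: a bounded buffer that discards the oldest items
window = deque(maxlen=3)
for n in range(5):
    window.append(n)
print(window)  # deque([2, 3, 4], maxlen=3)

# OrderedDict: a dict with explicit reordering operations
od = OrderedDict(a=1, b=2, c=3)
od.move_to_end("a")
print(list(od))  # ['b', 'c', 'a']
```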
88. How do you handle concurrency in Python?
Use threading for I/O-bound (GIL limits CPU), multiprocessing for CPU-bound, asyncio for async I/O. Libraries like concurrent.futures simplify.
from concurrent.futures import ThreadPoolExecutor
def task(n):
    return n * 2
with ThreadPoolExecutor() as executor:
    results = list(executor.map(task, range(5)))
print(results) # [0, 2, 4, 6, 8]
GIL affects threading for CPU tasks.
89. What is the typing module used for?
The typing module provides support for type hints (PEP 484), including TypeVar, Generic, Union, Optional, etc., for static type checking.
from typing import List, Optional
def func(numbers: List[int]) -> Optional[str]:
    return "Done" if numbers else None
Doesn't affect runtime; used by mypy.
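TypeVar and Union are mentioned above but not shown. A minimal sketch follows; the functions first and to_text are made up for the example:

```python
from typing import Optional, Sequence, TypeVar, Union

T = TypeVar("T")  # placeholder for "whatever element type the caller passes"

def first(items: Sequence[T]) -> Optional[T]:
    # A type checker infers Optional[int] for first([1, 2, 3]).
    return items[0] if items else None

def to_text(value: Union[int, float]) -> str:
    return f"{value:.1f}"

print(first([1, 2, 3]))  # 1
print(first([]))         # None
print(to_text(2))        # 2.0
```

A checker such as mypy would flag to_text("2") as an error, whereas the interpreter itself never consults the annotations.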
90. How do you implement a singleton pattern in Python?
Use a metaclass, module-level variable, or decorator. Common: override __new__.
class Singleton:
    _instance = None
    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance
s1 = Singleton()
s2 = Singleton()
print(s1 is s2) # True
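This version is not thread-safe: guard the instance check with a threading.Lock if instances may be created from several threads. Note that __init__, if defined, still runs on every call.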
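One possible thread-safe variant uses a metaclass with a lock. This is a sketch under the assumption that constructor arguments after the first call may be ignored; SingletonMeta and Config are hypothetical names:

```python
import threading

class SingletonMeta(type):
    _instances = {}
    _lock = threading.Lock()

    def __call__(cls, *args, **kwargs):
        # Double-checked locking: take the lock only if no instance exists yet.
        if cls not in cls._instances:
            with cls._lock:
                if cls not in cls._instances:
                    cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]

class Config(metaclass=SingletonMeta):
    def __init__(self, name="default"):
        self.name = name

c1 = Config("first")
c2 = Config("second")  # arguments ignored: the existing instance is returned

print(c1 is c2, c2.name)  # True first
```

Because the metaclass intercepts the call itself, __init__ runs only once, unlike the __new__-based version.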
91. What is the purpose of the abc module?
The abc (Abstract Base Classes) module defines abstract classes and methods, enforcing implementation in subclasses.
from abc import ABC, abstractmethod
class Shape(ABC):
    @abstractmethod
    def area(self):
        pass
class Circle(Shape):
    def area(self):
        return 3.14 # Must implement
Prevents instantiation of abstract classes.
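To make the enforcement visible, the sketch below tries to instantiate the abstract class and catches the resulting TypeError. The describe method and the radius attribute are additions for the example:

```python
import math
from abc import ABC, abstractmethod

class Shape(ABC):
    @abstractmethod
    def area(self):
        """Return the area of the shape."""

    def describe(self):
        # Concrete method shared by all subclasses
        return f"{type(self).__name__} with area {self.area():.2f}"

class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2

try:
    Shape()
except TypeError as exc:
    error = str(exc)
    print("Cannot instantiate:", error)

print(Circle(1).describe())  # Circle with area 3.14
```

The exact wording of the TypeError message differs between Python versions, so it is printed here rather than compared.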
92. What is the purpose of the dataclasses module?
The dataclasses module (Python 3.7+) auto-generates special methods like __init__, __repr__, __eq__ for classes.
from dataclasses import dataclass
@dataclass
class Point:
    x: int
    y: int
p = Point(1, 2)
print(p) # Point(x=1, y=2)
Reduces boilerplate for data holders.
93. How do you perform static type checking in Python?
Use tools like mypy, pyright, or PyCharm's checker. Add type hints, run mypy file.py.
# Install: pip install mypy
# Run: mypy script.py
def add(a: int, b: int) -> int:
    return a + b
Catches type errors pre-runtime.
94. What is duck typing in Python?
Duck typing checks for method/attribute presence at runtime, not type. "If it quacks like a duck..."
def make_sound(animal):
    animal.sound() # Assumes .sound() exists
class Dog:
    def sound(self):
        print("Woof")
make_sound(Dog()) # Works
Promotes flexibility over strict interfaces.
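Duck typing pairs naturally with the "easier to ask forgiveness than permission" style: attempt the call and handle the failure instead of checking the type first. A small sketch with made-up classes:

```python
class Duck:
    def sound(self):
        return "Quack"

class Robot:
    def sound(self):
        return "Beep"

class Rock:
    pass  # has no sound() method

def describe(thing):
    # Try the call; fall back if the object does not support it.
    try:
        return thing.sound()
    except AttributeError:
        return "(silent)"

sounds = [describe(x) for x in (Duck(), Robot(), Rock())]
print(sounds)  # ['Quack', 'Beep', '(silent)']
```

Duck and Robot share no base class; they work with describe() only because both provide sound().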
95. How do you handle circular imports in Python?
Refactor code: move shared code to third module, import inside functions, or use importlib.import_module(). Avoid mutual dependencies.
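# a.py
def helper():
    return "helper from a"
def func_a():
    from b import func_b # Deferred import: resolved at call time, after both modules have loaded
    return func_b()
# b.py
def func_b():
    from a import helper
    return helper()
A circular import at module level typically fails with an ImportError (or an AttributeError on a partially initialized module); deferring the import into a function works around it, but redesigning the modules is the more robust fix.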
96. What is the purpose of the enum module?
The enum module provides Enum, IntEnum for creating enumerated constants, which are immutable and hashable.
from enum import Enum
class Color(Enum):
    RED = 1
    GREEN = 2
print(Color.RED) # Color.RED
print(Color.RED.value) # 1
Improves readability over magic numbers.
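Lookups, iteration, and the difference between Enum and IntEnum can be sketched as follows. Priority is a hypothetical second enum added for contrast:

```python
from enum import Enum, IntEnum

class Color(Enum):
    RED = 1
    GREEN = 2

class Priority(IntEnum):
    LOW = 1
    HIGH = 2

print(Color(1))                  # Color.RED    (lookup by value)
print(Color["GREEN"])            # Color.GREEN  (lookup by name)
print([c.name for c in Color])   # ['RED', 'GREEN']

print(Priority.LOW < Priority.HIGH)  # True: IntEnum members behave like ints
print(Priority.HIGH == 2)            # True
print(Color.RED == 1)                # False: plain Enum members do not equal their values
```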
97. What is the purpose of the itertools module?
The itertools module provides iterator functions for efficient looping: count, cycle, product, permutations, combinations, etc.
import itertools
for combo in itertools.combinations([1, 2, 3], 2):
    print(combo) # (1, 2), (1, 3), (2, 3)
counter = itertools.count(start=5, step=2) # Infinite: 5,7,9,...
Memory-efficient for large iterations.
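The other functions listed above can be sketched in a few lines. islice is used here to take a finite slice from the infinite iterators:

```python
import itertools

pairs = list(itertools.product("ab", [1, 2]))
print(pairs)  # [('a', 1), ('a', 2), ('b', 1), ('b', 2)]

perms = list(itertools.permutations([1, 2, 3], 2))
print(perms)  # [(1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2)]

# count and cycle never end, so take only the first few items.
odds = list(itertools.islice(itertools.count(start=5, step=2), 4))
print(odds)  # [5, 7, 9, 11]

repeated = list(itertools.islice(itertools.cycle("xy"), 5))
print(repeated)  # ['x', 'y', 'x', 'y', 'x']
```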
98. What are the differences between Python 2 and Python 3?
- Print: Function in 3 (print("hi")), statement in 2 (print "hi").
- Division: Integer / returns float in 3, floor in 2.
- Unicode: str is Unicode text in 3; in 2, str is a byte string and Unicode text needs the u"unicode" prefix.
- xrange: range in 3 is like xrange in 2.
- Exceptions: except E as e is required in 3; 2 also accepted except E, e.
- Iterators: the built-in next(it) calls it.__next__() in 3; in 2 the method was named it.next().
- Metaclasses: Syntax differences.
Python 2 reached end of life on January 1, 2020; new code should target Python 3.
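The Python 3 side of several of these differences can be checked directly. This sketch runs on Python 3 only, and the sample values are arbitrary:

```python
# Division: / is true division, // is floor division
true_div = 7 / 2
floor_div = 7 // 2
print(true_div, floor_div)  # 3.5 3

# range is lazy, like Python 2's xrange
lazy = range(10**9)  # no billion-element list is built
print(lazy[5])       # 5

# str is Unicode text; encoding to bytes is explicit
text = "héllo"
chars, nbytes = len(text), len(text.encode("utf-8"))
print(chars, nbytes)  # 5 6

# Iterators are advanced with the built-in next()
it = iter([1, 2])
first = next(it)
print(first)  # 1
```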
99. How do you create a virtual environment in Python?
Use venv module: python -m venv env_name. Activate: source env_name/bin/activate (Unix) or env_name\Scripts\activate (Windows).
# Create
python -m venv myenv
# Activate (Unix)
source myenv/bin/activate
# Deactivate
deactivate
Isolates packages per project.
100. What is the purpose of the pip tool?
Pip is the package installer for Python, used to install, upgrade, uninstall packages from PyPI or local files.
# Install
pip install requests
# Upgrade
pip install --upgrade pip
# Uninstall
pip uninstall requests
# List
pip list
Manages dependencies efficiently.


