Important Topics and Notes for Data Structures and Algorithms Interview

Zeeshan Shaikh
11 min readJul 7, 2021

Data Structures

Chapter 1 I Arrays and Strings

HashTables.

ArrayList & Resizable Arrays

StringBuilder.

Chapter 2 I Linked Lists

Creating a Linked List

Deleting a Node from a Singly Linked List

The “Runner” Technique

Recursive Problems

Chapter 3 | Stacks and Queues

Implementing a Stack

Implementing a Queue

Chapter 4 | Trees and Graphs

Types of Trees

Binary Tree Traversal

Binary Heaps (Min-Heaps and Mox-Heaps)

Tries (Prefix Trees)

Graphs

Graph Search

Algorithms

Chapter 5 | Bit Manipulation

Bit Manipulation By Hand

Bit Facts and Tricks

Two’s Complement and Negative Numbers

Arithmetic vs. Logical Right Shift

Common Bit Tasks: Getting and Setting

Chapter 6 | Math and Logic Puzzles

Prime Numbers

Probability

Develop Rules and Patterns

Worst Case Shifting

Chapter 7 | Object-Oriented Design

Chapter 8 | Recursion and Dynamic Programming

Recursive vs. Iterative Solutions

Dynamic Programming & Memoization

Chapter 9 I System Design and Scalability

Chapter 10 | Sorting and Searching

Sorting

  1. Bubble
  2. Selection
  3. merge
  4. quick
  5. Bucket
  6. radix

Searching

  1. Linear
  2. Binary
  3. Jump
  4. Interpolation
  5. Exponential

Reference for topics: Cracking the Coding Interview by Gayle L McDowell

Important Notes:

Arrays

Array: Arrays are defined as the collection of similar type of data items stored at contiguous memory locations.

2D Array: 2D array can be defined as an array of arrays. The 2D array is organized as matrices which can be represented as the collection of rows and columns.

int arr[max_rows][max_columns];

Linked List

Linked List:

  • A linked list is also a collection of elements, but the elements are not stored in a consecutive location.
  • These elements are linked to each other by providing one additional information along with an element, i.e., the address of the next element.
  • The variable that stores the address of the next element is known as a pointer.
  • Therefore, we conclude that the linked list contains two parts, i.e., the first one is the data element, and the other is the pointer.

Advantages of using a Linked list over Array

  1. Dynamic data structure:

The size of the linked list is not fixed as it can vary according to our requirements.

2. Insertion and Deletion:

Insertion and deletion in the linked lists are easier than the array as the elements in an array are stored8 in a consecutive location. In contrast, in the case of a linked list, the elements are stored in a random location. The complexity for insertion and deletion of elements from the beginning is O(1) in the linked list, while in the case of an array, the complexity would be O(n). If we want to insert or delete the element in an array, then we need to shift the elements for creating the space. On the other hand, in the linked list, we do not have to shift the elements. In the linked list, we just need to update the address of the pointer in the node.

3. Memory efficient

Its memory consumption is efficient as the size of the linked list can grow or shrink according to our requirements.

Implementation

4. Both the stacks and queues can be implemented using a linked list.

Disadvantages of Linked list

  1. Memory usage

The node in a linked list occupies more memory than an array as each node occupies two types of variables, i.e., one is a simple variable, and another is a pointer variable that occupies 4 bytes in the memory.

2. Traversal

In a linked list, the traversal is not easy. If we want to access the element in a linked list, we cannot access the element randomly, but in the case of an array, we can randomly access the element by index. For example, if we want to access the 3rd node, then we need to traverse all the nodes before it. So, the time required to access a particular node is large.

3. Reverse traversing

In a linked list, backtracking or reverse traversing is difficult. In a doubly-linked list, it is easier but requires more memory to store the back pointer.

Types of Linked list

  1. Singly Linked list The singly linked list is a data structure that contains two parts, i.e., one is the data part, and the other one is the address part, which contains the address of the next or the successor node. The address part in a node is also known as a pointer.
  2. Doubly Linked list: Doubly linked list is a complex type of linked list in which a node contains a pointer to the previous as well as the next node in the sequence.
  3. Circular Linked list: In a circular Singly linked list, the last node of the list contains a pointer to the first node of the list. We can have a circular singly linked list as well as a circular doubly linked list. We traverse a circular singly linked list until we reach the same node where we started. The circular singly liked list has no beginning and no end. There is no null value present in the next part of any of the nodes.
  4. Doubly Circular Linked list Circular doubly linked list is a more complex type of data structure in which a node contains pointers to its previous node as well as the next node. The circular doubly linked list doesn’t contain NULL in any of the nodes. The last node of the list contains the address of the first node of the list. The first node of the list also contains the address of the last node in its previous pointer.

Binary Tree

  • A tree whose elements have at most 2 children is called a binary tree.
  • Since each element in a binary tree can have only 2 children, we typically name them the left and right child.

Binary Search Tree

Binary Search Tree is a node-based binary tree data structure that has the following properties:

  • The left subtree of a node contains only nodes with keys lesser than the node’s key.
  • The right subtree of a node contains only nodes with keys greater than the node’s key.
  • The left and right subtree each must also be a binary search tree.

Graph

  1. A Graph consists of a finite set of vertices(or nodes) and a set of Edges that connect a pair of nodes.
  2. A Graph is a non-linear data structure consisting of nodes and edges.
  3. A graph is a pictorial representation of a set of objects where some pairs of objects are connected by links. It is composed of two elements — nodes (vertices) and relationships (edges).

A graph database is a database used to model the data in the form of a graph. Here, the nodes of a graph depict the entities while the relationships depict the association of these nodes.

Popular Graph Databases

Neo4j is a popular Graph Database. Other Graph Databases are Oracle NoSQL Database, OrientDB, HypherGraphDB, GraphBase, InfiniteGraph, and AllegroGraph.

Algorithms

Searching

Searching

Searching is the process of finding some particular element in the list. If the element is present in the list, then the process is called successful and the process returns the location of that element, otherwise the search is called unsuccessful.

  1. Linear Search Linear search is the simplest search algorithm and often called a sequential search. In this type of searching, we simply traverse the list completely and match each element of the list with the item whose location is to be found.
  2. Binary Search Binary search is the search technique that works efficiently on sorted lists. Binary search follows divide and conquer approach in which, the list is divided into two halves and the item is compared with the middle element of the list. If the match is found then, the location of the middle element is returned otherwise, we search into either of the halves depending upon the result produced through the match.

Sorting

Selection Sort

The selection sort algorithm sorts an array by repeatedly finding the minimum element (considering ascending order) from the unsorted part and putting it at the beginning. The algorithm maintains two subarrays in a given array.

  1. The subarray is already sorted
  2. The remaining subarray is unsorted.

In every iteration of selection sort, the minimum element (considering ascending order) from the unsorted subarray is picked and moved to the sorted subarray.

Insertion Sort

The array is virtually split into a sorted and an unsorted part. Values from the unsorted part are picked and placed at the correct position in the sorted part.

To sort an array of size n in ascending order:

  1. Iterate from arr[1] to arr[n] over the array.
  2. Compare the current element (key) to its predecessor.
  3. If the key element is smaller than its predecessor, compare it to the elements before. Move the greater elements one position up to make space for the swapped element.

Greedy Algorithms

  1. Travelling Salesman Problem
  2. Prim’s Minimal Spanning Tree Algorithm
  3. Kruskal’s Minimal Spanning Tree Algorithm
  4. Dijkstra’s Minimal Spanning Tree Algorithm
  5. Graph — Map Coloring
  6. Graph — Vertex Cover
  7. Knapsack Problem
  8. Job Scheduling Problem

Traveling Salesman Problem: A traveler needs to visit all the cities from a list, where distances between all the cities are known and each city should be visited just once. What is the shortest possible route that he visits each city exactly once and returns to the origin city?

Minimum Spanning Tree

What is a Spanning Tree?

Given an undirected and connected graph, a spanning tree of the graph is a tree that spans (that is, it includes every vertex of ) and is a subgraph of (every edge in the tree belongs to )

Prim’s Algorithm

Prim’s algorithm is a greedy approach to find the minimum spanning tree. In this algorithm, to form an MST we can start from an arbitrary vertex.

Kruskal’s Algorithm

Kruskal’s Algorithm builds the spanning tree by adding edges one by one into a growing spanning tree. Kruskal’s algorithm follows a greedy approach as in each iteration it finds an edge that has the least weight and adds it to the growing spanning tree.

Dijkstra’s algorithm (or Dijkstra’s Shortest Path First algorithm, SPF algorithm) is an algorithm for finding the shortest paths between nodes in a graph.

Dynamic Programming

  • Dynamic Programming is mainly an optimization over plain recursion.
  • Wherever we see a recursive solution that has repeated calls for the same inputs, we can optimize it using Dynamic Programming.
  • The idea is to simply store the results of subproblems so that we do not have to re-compute them when needed later.

What is the difference between memoization and dynamic programming?

  • Memoization is a term describing an optimization technique where you cache previously computed results, and return the cached result when the same computation is needed again.
  • Dynamic programming is a technique for solving problems of recursive nature, iteratively and is applicable when the computations of the subproblems overlap.
  • Dynamic programming is typically implemented using tabulation, but can also be implemented using memoization.

Memoization vs Tabulation

When you solve a dynamic programming problem using tabulation you solve the problem “bottom up”, i.e., by solving all related sub-problems first, typically by filling up an n-dimensional table. Based on the results in the table, the solution to the “top” / original problem is then computed.

If you memoization to solve the problem you do it by maintaining a map of already solved sub problems. You do it “top down” in the sense that you solve the “top” problem first (which typically recurses down to solve the sub-problems).

Backtracking

Backtracking Algorithms meaning- to change your mind about a plan, promise, etc. that you have made)

  • Backtracking is an algorithmic-technique for solving problems recursively by trying to build a solution incrementally, one piece at a time, removing those solutions that fail to satisfy the constraints of the problem at any point of time (by time, here, is referred to the time elapsed till reaching any level of the search tree).
  • Backtracking follows Brute force approach. DFS
  • Branch and Bound also use Brute force. BFS

Example: N-Queens problem is to find an arrangement of N queens on a chessboard, such that no queen can attack any other queens on the board. The chess queens can attack in any direction as a horizontal, vertical, horizontal, and diagonal way. A binary matrix is used to display the positions of N Queens, where no queens can attack other queens.

Divide and Conquer

Divide and Conquer algorithms

  1. Quicksort is a sorting algorithm. The algorithm picks a pivot element, rearranges the array elements in such a way that all elements smaller than the picked pivot element move to the left side of the pivot, and all greater elements move to the right side. Finally, the algorithm recursively sorts the subarrays on the left and right of the pivot element.
  2. Merge Sort is also a sorting algorithm. The algorithm divides the array into two halves, recursively sorts them, and finally merges the two sorted halves.
  3. Closest Pair of Points The problem is to find the closest pair of points in a set of points in the x-y plane. The problem can be solved in O(n²) time by calculating the distances of every pair of points and comparing the distances to find the minimum. The Divide and Conquer algorithm solves the problem in O(nLogn) time.
  4. Strassen’s Algorithm is an efficient algorithm to multiply two matrices. A simple method to multiply two matrices need 3 nested loops and is O(n³). Strassen’s algorithm multiplies two matrices in O(n².8974) time.
  5. Cooley–Tukey Fast Fourier Transform (FFT) algorithm is the most common algorithm for FFT. It is a divide and conquers algorithm which works in O(nlogn) time.
  6. Karatsuba algorithm for fast multiplication does the multiplication of two n-digit numbers in at most

single-digit multiplications in general (and exactly

when n is a power of 2). It is therefore faster than the classical algorithm, which requires n2 single-digit products. If n = 210 = 1024, in particular, the exact counts are 310 = 59, 049 and (210)2 = 1, 048, 576, respectively.

Mathematical Algorithms

Tower of Hanoi

Tower of Hanoi is a mathematical puzzle where we have three rods and n disks. The objective of the puzzle is to move the entire stack to another rod, obeying the following simple rules:

1) Only one disk can be moved at a time.

2) Each move consists of taking the upper disk from one of the stacks and placing it on top of another stack i.e. a disk can only be moved if it is the uppermost disk on a stack.

3) No disk may be placed on top of a smaller disk.

Cyclomatic complexity: is a software metric used to indicate the complexity of a program. It is a quantitative measure of the number of linearly independent paths through a program’s source code.

PS: All images and notes might be subject to copyright and the rights belongs to the respective owners.

--

--

Zeeshan Shaikh

Software Developer with 6+ years of experience in Java and web technologies. Health and Fitness enthusiast.