Python find sequence in list

Also, for efficiency, you can use KMP algorithm that is used in string matching (from here):

def KMPSearch(pat, txt): M = len(pat) N = len(txt) # create lps[] that will hold the longest prefix suffix # values for pattern lps = [0]*M j = 0 # index for pat[] # Preprocess the pattern (calculate lps[] array) computeLPSArray(pat, M, lps) i = 0 # index for txt[] while i < N: if pat[j] == txt[i]: i += 1 j += 1 if j == M: print("Found pattern at index " + str(i-j)) j = lps[j-1] # mismatch after j matches elif i < N and pat[j] != txt[i]: # Do not match lps[0..lps[j-1]] characters, # they will match anyway if j != 0: j = lps[j-1] else: i += 1 def computeLPSArray(pat, M, lps): len = 0 # length of the previous longest prefix suffix lps[0] # lps[0] is always 0 i = 1 # the loop calculates lps[i] for i = 1 to M-1 while i < M: if pat[i]== pat[len]: len += 1 lps[i] = len i += 1 else: # This is tricky. Consider the example. # AAACAAAA and i = 7. The idea is similar # to search step. if len != 0: len = lps[len-1] # Also, note that we do not increment i here else: lps[i] = 0 i += 1 a = [2,3,5,2,5,6,7,2] b = [2,5,6] KMPSearch(b, a)

This find the first index of the b in a. Hence, the range is the result of the search and its plus to the length of b.

\$\begingroup\$

I recently applied for a job as a Python coder but was rejected.

This was the problem:

Write a python code to check if an array has a sequence (1,3,4)

Assuming they were looking for expert Python programmers, what could I have done better?

# Tested with Python 2.7 import unittest # Runtime: O(n) def doesSeqAppear(int_arr): #check if input is a list if not isinstance(int_arr, list): raise TypeError("Input shall be of type array.") # check all elements are of type int if not all(isinstance(item, int) for item in int_arr) : raise ValueError("All elements in array shall be of type int.") arr_len = len(int_arr) if arr_len < 3: return False # Loop through elements for i in range(arr_len-2): if int_arr[i] == 1 and \ int_arr[i+1] == 3 and \ int_arr[i+2] == 4 : return True return False class TestMethodDoesSeqAppear(unittest.TestCase): def test_only_single_seq(self): #Single time assert doesSeqAppear([1,3,4]) == True def test_multiple_seq(self): #multiple assert doesSeqAppear([2,2,1,3,4,2,1,3,4]) == True def test_neg_seq(self): #multiple assert doesSeqAppear([9,-1,1,3,4,-4,4]) == True def test_only_empty_seq(self): #empty assert doesSeqAppear([]) == False def test_only_single_elem_seq(self): #Single element assert doesSeqAppear([1]) == False def test_input_is_none(self): self.assertRaises(TypeError, doesSeqAppear, None) def test_raises_type_error(self): self.assertRaises(TypeError, doesSeqAppear, "string") def test_raises_value_error(self): self.assertRaises(ValueError, doesSeqAppear, [1,2,'a', 'b']) if __name__ == '__main__': unittest.main() #

200_success

143k22 gold badges185 silver badges468 bronze badges

asked Dec 14, 2016 at 13:33

\$\endgroup\$

11

\$\begingroup\$

By PEP 8, doesSeqAppear should be does_seq_appear. You used the right naming convention for your unit tests, though. Personally, I would prefer def contains_seq(arr, seq=[1, 3, 4]).

Your arr_len < 3 test is superfluous and should therefore be eliminated. Don't write a special case when the regular case works correctly and just as quickly.

Your all(isinstance(item, int) for item in int_arr) check was not specified in the problem, and is therefore harmful. The question does not say that doesSeqAppear([3.1, 1, 3, 4]) should return False, nor does it say that it should fail with an exception. In fact, by my interpretation, it does contain the magic sequence and should therefore return True. In any case, you have wasted a complete iteration of the list just to perform a check that wasn't asked for.

Checking isinstance(int_arr, list) is un-Pythonic, since duck-typing is the norm in Python. In any case, the code would likely fail naturally if it is not a list.

After cutting all that excess, you should drop the # Loop through elements comment as well.

answered Dec 14, 2016 at 14:49

200_success200_success

143k22 gold badges185 silver badges468 bronze badges

\$\endgroup\$

4

\$\begingroup\$

Per the problem definition, I would expect a function thas is able to check any sequence in an array. Not necessarily (1, 3, 4) which was given as an example. In this case, the sequence should also be a parameter of the function, giving the signature:

def has_sequence(array, sequence):

Next, I would rely on Python iterations to "check" if array is a list, or at least an iterable. As there is no obvious reasons, to me, that has_sequence('once upon a time', 'e u') should fail. It seems like a valid usecase.

Following, I would use a variation of the itertools recipe pairwise to group elements of array in tuples of the same length than sequence:

import itertools def lookup(iterable, length): tees = itertools.tee(iterable, length) for i, t in enumerate(tees): for _ in xrange(i): next(t, None) return itertools.izip(*tees) def has_sequence(array, sequence): # Convert to tuple for easy testing later sequence = tuple(sequence) return any(group == sequence for group in lookup(array, len(sequence)))

Now, other things that could have been done better:

  • # Tested with Python 2.7 can be replaced by #!/usr/bin/env python2
  • if int_arr[i] == 1 and int_arr[i+1] == 3 and int_arr[i+2] == 4 : can be replaced by if int_arr[i:i+3] == [1, 3, 4]: removing the need for the ugly \
  • assert in unit tests should be replaced by self.assertTrue(…) or self.assertFalse(…)
  • you should be more consistent in your usage of whitespace (putting one after each comma, none before any colon…).

answered Dec 14, 2016 at 15:03

\$\endgroup\$

1

\$\begingroup\$

I think your answer is much too long. Here's mine:

def check_for_1_3_4(seq): return (1, 3, 4) in zip(seq, seq[1:], seq[2:])

Here are some tests:

>>> check_for_1_3_4([1, 3, 4, 5, 6, 7]) True >>> check_for_1_3_4([5, 6, 7, 1, 3, 4]) True >>> check_for_1_3_4([5, 6, 1, 3, 4, 7, 8]) True >>> check_for_1_3_4([1, 3]) False >>> check_for_1_3_4([]) False >>>

My code may seem terse, but it's still readable for anyone who understands slicing and zip. I expect Python experts to at least know about slicing.

Unfortunately for me, my answer is less efficient than yours. It could triple the amount of memory used! By using generators a more efficient but more complicated solution can be created. Instead of creating copies of the sequence, this new code uses only the original sequence, but the logic is nearly the same.

import itertools def check_for_1_3_4(seq): return (1, 3, 4) in itertools.izip(seq, itertools.islice(seq, 1, None), itertools.islice(seq, 2, None))

The tests still pass.

I wouldn't expect most Python programmers to be familiar with itertools, but I was under the impression that Python experts do know it.

answered Dec 15, 2016 at 2:20

DrewDrew

812 bronze badges

\$\endgroup\$

4

\$\begingroup\$

Assumptions

You made a lot of assumptions with this code, which you either did not mention during the interview or you incorrectly assumed to be true of the question. In other words, you were over thinking the problem.

#check if input is a list if not isinstance(int_arr, list): raise TypeError("Input shall be of type array.")

You should not care about the instance type. The type could easily have been a user defined type which behaves like a list or even another python built in. For example, python has both deque and array, and they both behave like a list, supporting the same operations as a list.

# check all elements are of type int if not all(isinstance(item, int) for item in int_arr) : raise ValueError("All elements in array shall be of type int.")

This is not necessarily true because lists or collections in general, in python can contain many different types. So insisting that the list contains only integers is just imposing a requirement which did not exist in the first place.

In closing, I would advice that you adhere to the KISS principle in future interviews and to ask questions or state your assumptions before diving into the code. Even if it doesn't sound like an assumption to you, make sure they know what is going on in your head either as you're coding or before you write your code. It might sound silly to say "Ok I will also make sure that I have been given a list", but you will be saving yourself a lot of grief when they reply, "Don't worry about that, just assume it's a list".

Check if array contains the sequence (1,3,4)

def check_sequence(arr): return any((arr[i], arr[i + 1], arr[i + 2]) == (1,3,4) for i in range(len(arr) - 2))

answered Dec 15, 2016 at 6:42

smac89smac89

1,47510 silver badges23 bronze badges

\$\endgroup\$

\$\begingroup\$

KIS[S]

def sequence_contains_sequence(haystack_seq, needle_seq): for i in range(0, len(haystack_seq) - len(needle_seq) + 1): if needle_seq == haystack_seq[i:i+len(needle_seq)]: return True return False

We can't know why your interviewer rejected your application, but these types of questions are often starting points for conversation--not finished product endpoints. If you write the simplest, most straightforward code you can, you and your interviewer can then talk about things like expansion, generalization, and performance.

Your interviewer knows that asking you to change your function interface is more problematic because you'll also have to change all your [unasked for] unit tests. This slows down the process and might make the interviewer worry that you'll pollute their codebase with a lot of brittle code.

answered Dec 14, 2016 at 18:25

brian_obrian_o

8395 silver badges9 bronze badges

\$\endgroup\$

2

How do you check if a sequence is in a list Python?

In python, the in operator can be used to check the presence of a value in a sequence(string, list, tuple, set, dictionary). It returns a boolean value 'True' if the item is found in the sequence else it returns False. Let us consider a list my_list = ["bat", 3, "apple", "dog", 4.8 ].

How do you find the order of a number in a list Python?

Practical Data Science using Python.
count := 0, ans := 0..
for i in range 2 to size of nums, do. if nums[i] - nums[i - 1] is same as nums[i - 1] - nums[i - 2], then. count := count + 1. otherwise, ... .
if count is non-zero, then. ans := ans + quotient of (count *(count + 1)) / 2..
return ans..

How do you find all occurrences of an element in a list?

Find index of all occurrences of an item in a Python list.
Using enumerate() function. To get the index of all occurrences of an element in a list, you can use the built-in function enumerate(). ... .
Using range() function. ... .
Using itertools module. ... .
Using more_itertools module. ... .
Using NumPy Library..

Can we use Find method for list?

I was trying to write an answer to this question and was quite surprised to find out that there is no find method for lists, lists have only the index method (strings have find and index).

Chủ đề