Kaira O'Neill
Asked: May 18, 2022 In: python

The “Cannot use a string pattern on a bytes-like object” error: what should I do?

I get the “can't use a string pattern on a bytes-like object” error, as the title says. How can I fix it? Here is my code:

import urllib.request
import re

url = "http://www.google.com"
regex = r'<title>(,+?)</title>'
pattern = re.compile(regex)

with urllib.request.urlopen(url) as response:
    html = response.read()

title = re.findall(pattern, html)
print(title)

When I ran it, I got this error:

Traceback (most recent call last):
  File "path\to\file\Crawler.py", line 11, in <module>
    title = re.findall(pattern, html)
  File "C:\Python33\lib\re.py", line 201, in findall
    return _compile(pattern, flags).findall(string)
TypeError: can't use a string pattern on a bytes-like object

I appreciate any help from you.

python-3.x

2 Answers

  1. Best Answer
    lyytutoria Expert
    Added an answer on June 21, 2022 at 2:09 am

    The cause:

    You get this error because you are applying a string (str) pattern to a bytes object: response.read() returns bytes, not str. Python is not told how those bytes are encoded, so it refuses to match a str pattern against them and raises a TypeError.

    Solution:

    To fix this error, convert html (a bytes object) into a string with .decode, as follows:

    html = response.read().decode('utf-8')
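A minimal sketch of the decode approach. It assumes a hard-coded bytes value in place of response.read() so that it runs without network access; the sample HTML and the '.+?' pattern are made up for illustration and are not taken from the question.

```python
import re

# Stand-in for response.read(); the page content here is invented.
html_bytes = b"<html><head><title>Example Page</title></head></html>"

# Decode the bytes into a str so that a str pattern can be applied.
html = html_bytes.decode("utf-8")

pattern = re.compile(r"<title>(.+?)</title>")
titles = pattern.findall(html)
print(titles)  # ['Example Page']
```

Decoding once, right after reading, keeps the rest of the program working with str only.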

    Alternatively, you can leave html as bytes and use a bytes regex instead. The matches are then bytes as well:

    regex = rb'<title>(.+?)</title>'  # '.+?' rather than the question's ',+?', which matches only commas
    #        ^
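As a rough illustration of the bytes-regex route, the sketch below again assumes an invented bytes value instead of a real response. It shows that the matches come back as bytes and can be decoded afterwards if needed.

```python
import re

# Stand-in for response.read(); the page content here is invented.
html_bytes = b"<html><head><title>Example Page</title></head></html>"

# The rb prefix makes the pattern bytes, matching the type of the data.
pattern = re.compile(rb"<title>(.+?)</title>")
titles = pattern.findall(html_bytes)
print(titles)  # [b'Example Page']

# Decode the individual matches if str values are wanted downstream.
decoded = [t.decode("utf-8") for t in titles]
print(decoded)  # ['Example Page']
```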

    For more background, see the article “Change bytes to a Python String”.

  2. Ambre Bernard
    Added an answer on May 26, 2022 at 3:55 am

    Your regex is a string, but html is bytes:

    >>> type(html)
    <class 'bytes'>

    Because Python does not know how those bytes are encoded, it raises an error when you try to apply a string regex to them.
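The mismatch can be reproduced without any network call. The sketch below is only an illustration: the helper name raises_type_error and the sample data are hypothetical. It shows that re rejects mixed str/bytes combinations in both directions and accepts matching types.

```python
import re

html = b"<title>Example Page</title>"  # bytes, like the result of response.read()

def raises_type_error(pattern, data):
    """Return True if re.findall rejects this pattern/data combination."""
    try:
        re.findall(pattern, data)
    except TypeError:
        return True
    return False

print(raises_type_error(r"<title>(.+?)</title>", html))            # True: str pattern, bytes data
print(raises_type_error(rb"<title>(.+?)</title>", html.decode()))  # True: bytes pattern, str data
print(raises_type_error(rb"<title>(.+?)</title>", html))           # False: both are bytes
```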

    You can decode the bytes to a string:

    html = html.decode('ISO-8859-1') # encoding may vary!
    title = re.findall(pattern, html) # no more error
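To illustrate why the encoding may vary, here is a small sketch with an invented page body encoded as ISO-8859-1. Decoding it with the wrong codec fails, while the right one gives the expected title. The sample text and pattern are assumptions.

```python
import re

# Invented page body; "\u00e9" (e with acute accent) becomes byte 0xE9 in ISO-8859-1.
html_bytes = "<title>Caf\u00e9</title>".encode("iso-8859-1")

# 0xE9 followed by "<" is not valid UTF-8, so this decode attempt fails.
try:
    html_bytes.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decoding with the encoding the bytes were actually written in works.
html = html_bytes.decode("iso-8859-1")
title = re.findall(r"<title>(.+?)</title>", html)
print(utf8_ok)  # False
```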

    You can also use a bytes regex:

    regex = rb'<title>(.+?)</title>'  # '.+?' rather than the question's ',+?', which matches only commas
    #        ^

    In this particular case, you can get the encoding from the response headers:

    with urllib.request.urlopen(url) as response:
        encoding = response.info().get_param('charset', 'utf8')
        html = response.read().decode(encoding)

    For more information, see the urlopen documentation.
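The header lookup can be tried offline. response.info() returns a message object derived from email.message.Message, so the sketch below builds such a message by hand. The helper name charset_from_headers, the header value, and the sample body are hypothetical.

```python
from email.message import Message

def charset_from_headers(headers, default="utf-8"):
    """Return the charset parameter of Content-Type, or default if absent."""
    return headers.get_param("charset", default)

# Hand-built stand-in for response.info(); the header value is invented.
headers = Message()
headers["Content-Type"] = "text/html; charset=ISO-8859-1"

body = "<title>Caf\u00e9</title>".encode("iso-8859-1")  # stand-in for response.read()
encoding = charset_from_headers(headers)
html = body.decode(encoding)

print(encoding)                           # ISO-8859-1
print(charset_from_headers(Message()))    # utf-8 (no Content-Type header, so the default is used)
```

The default only applies when the server sends no charset; a page may still declare a different encoding in its own markup, which this sketch does not check.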
