Program to calculate Hash and Size of a torrent












I wrote this small program to open a .torrent file and retrieve its info-hash and size. I'm still a beginner at Python, so the main focus of this program was to try to use classes and objects instead of having just a bunch of functions. It works, but I wanted to know if this is a good design. Also, I feel some of the code I wrote is redundant, especially the number of times self is used.



    import hashlib, bencode

    class Torrent(object):

        def __init__(self, torrentfile):
            self.metainfo = bencode.bdecode(torrentfile.read())
            self.info = self.metainfo['info']
            self.files = self.metainfo['info']['files']
            self.md5hash = self.md5hash(self.info)
            self.size = self.size(self.files)

        def md5hash(self, info):
            return hashlib.sha1(bencode.bencode(info)).hexdigest()

        def size(self, files):
            filesize = 0
            for file in files:
                filesize += file['length']
            return filesize


    torrentfile = Torrent(open("test.torrent", "rb"))
    print(torrentfile.md5hash)
    print(torrentfile.size)





























  • Welcome to Code Review! Commendable presentation of purpose and concerns!
    – greybeard
    4 hours ago
















python object-oriented






asked 5 hours ago by Labrinth
edited 4 hours ago




1 Answer














Using classes for this seems like a good idea, since you end up with a number of class instances, where each one represents a specific torrent.



In terms of the specific code, you're doing things slightly wrong in 2 ways.



Firstly, you don't need to pass instance attributes into the methods of a class. The methods can read info and files directly as self.info and self.files, so they only need the self argument.



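Secondly, I can see that you're trying to cache the results of the method calls by overwriting the methods in __init__. Once self.md5hash and self.size have been assigned, those names refer to the stored values, and the methods can no longer be called on the instance. Caching is good, but this is a bad way of achieving it.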
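To make the shadowing problem concrete, here is a minimal sketch. The Demo class and its size method are hypothetical stand-ins for the Torrent class, chosen so the example runs without the bencode package:

```python
class Demo(object):
    def __init__(self):
        # Assigning to self.size creates an instance attribute that
        # shadows the method of the same name defined on the class.
        self.size = self.size([1, 2, 3])

    def size(self, lengths):
        return sum(lengths)


demo = Demo()
print(demo.size)            # 6: the stored integer, not a method
print(callable(demo.size))  # False: demo.size([4, 5]) would raise TypeError
print(callable(Demo.size))  # True: the function still exists on the class
```

The method is still reachable through the class, but anyone reading demo.size has to know whether __init__ has already replaced it, which is why distinct names are easier to follow.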



There are 2 alternatives that spring to mind, depending on what you want to do:



If you always want the size and hash calculated when the class is instantiated, then do something similar to what you're doing now, but use different names for the data variables and the methods:



    def __init__(self, torrentfile):
        self.metainfo = bencode.bdecode(torrentfile.read())
        self.info = self.metainfo['info']
        self.files = self.metainfo['info']['files']
        # Computed once here and stored under names that differ from the methods.
        self.md5hash = self.calculate_md5hash()
        self.size = self.calculate_size()

    def calculate_md5hash(self):
        # Despite the name, the info-hash is a SHA-1 digest of the bencoded info dict.
        return hashlib.sha1(bencode.bencode(self.info)).hexdigest()

    def calculate_size(self):
        filesize = 0
        for file in self.files:
            filesize += file['length']
        return filesize
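As a side note on the size loop, it can be written more compactly with sum(). The sketch below is only an illustration: total_size is a hypothetical helper, the dicts are hand-made stand-ins for a decoded torrent, and the keys are assumed to be str as in the code above. It also assumes the usual metainfo layout, where a single-file torrent has no 'files' list and stores 'length' directly in the info dict:

```python
def total_size(info):
    """Return the total payload size in bytes for a decoded info dict."""
    if 'files' in info:
        # Multi-file torrent: one dict per file, each with its own length.
        return sum(entry['length'] for entry in info['files'])
    # Single-file torrent (assumed layout): length sits in the info dict itself.
    return info['length']


multi = {'files': [{'length': 100}, {'length': 250}]}
single = {'length': 4096}
print(total_size(multi))   # 350
print(total_size(single))  # 4096
```

With the original code, a single-file torrent would raise KeyError on self.metainfo['info']['files'], so a check like this may be worth adding if such files need to be supported.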


Alternatively, if you only want the hash and size calculated when the methods are specifically called, but you also want caching, use lru_cache.



lru_cache caches the result of a function the first time it is run and simply returns that result for future calls, provided the arguments to the function remain the same.



    import hashlib
    from functools import lru_cache

    import bencode

    class Torrent(object):

        def __init__(self, torrentfile):
            self.metainfo = bencode.bdecode(torrentfile.read())
            self.info = self.metainfo['info']
            self.files = self.metainfo['info']['files']

        @lru_cache()
        def md5hash(self):
            # Despite the name, the info-hash is a SHA-1 digest of the bencoded info dict.
            return hashlib.sha1(bencode.bencode(self.info)).hexdigest()

        @lru_cache()
        def size(self):
            filesize = 0
            for file in self.files:
                filesize += file['length']
            return filesize


Then call the methods explicitly:



    print(torrentfile.md5hash())
    print(torrentfile.size())
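To see the caching behaviour without needing a .torrent file, here is a small self-contained sketch. The Counter class and its calls attribute are hypothetical and exist only to count how often the method body actually runs:

```python
from functools import lru_cache


class Counter(object):
    def __init__(self, values):
        self.values = values
        self.calls = 0  # how many times the body of total() actually ran

    @lru_cache()
    def total(self):
        self.calls += 1
        return sum(self.values)


c = Counter([1, 2, 3])
print(c.total(), c.total())             # 6 6
print(c.calls)                          # 1: the second call came from the cache
print(Counter.total.cache_info().hits)  # 1
```

Two caveats with this approach: the cache is keyed on self and stored on the function, so it holds a reference to every instance it has seen, and it will keep returning the old result if self.values is changed later. On Python 3.8 and newer, functools.cached_property is an alternative that stores the value on the instance instead.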





answered 4 hours ago by match
edited 2 hours ago by Mathias Ettinger





















