• 0 Posts
  • 24 Comments
Joined 2 years ago
Cake day: June 18th, 2023


  • The article you linked answers most of your questions.

    1. Relative global upstream traffic went down, though not because of other file-sharing protocols but because of entirely different applications
    2. I2P is not mentioned anywhere in the article, nor is any other sharing alternative
    3. VPN is mentioned as a potential reason for not being able to identify torrent traffic; VPNs have become much more prevalent and heavily promoted in the scene
    4. The article says that, within piracy, streaming websites are much more popular now

    It has not been surpassed by another protocol. The relative numbers don’t say much about absolute numbers or usage.

    And 10 % of global internet upload traffic is certainly not irrelevant.




  • If you do it right, you can have that AI replace all the complicated pirating and downloading process.

    How so? I don’t see how that would work.


    What are you trying to say about an AI fabricating a whole paper? It must have the same issues all trained statistical text-prediction “AI” has: hallucinations. Even if it is extended with sources, without validating them the paper’s claims are useless, since you can’t be sure a cited source even exists or says what the text claims.

    There are use cases for AI, but if you are looking for papers with reasoned and documented information, AI is the worst tool you can use. It may look correct while being confidently incorrect, and you end up misled.

    This post is about scientific papers, not statistically predicted, generated text.






    Depending on what you want to scrape, that’s a lot of overkill and overcomplication. A full website-testing framework may not be necessary for scraping, and Python with its tooling and package management may not be necessary either.

    I’ve recently extracted and downloaded stuff via Nushell.

    1. Requirement: Knowledge of CSS Selectors
    2. Inspect the website DOM in the web browser’s developer tools
      1. Identify structure
      2. Identify adequate selectors; these can be tested in the browser dev tools console via document.querySelectorAll()
    3. Get and query data

    For me, the command-line shell and scripting language of choice is Nushell:

    # `query web` requires the nu_plugin_query plugin
    let $html = http get 'https://example.org/'
    let $meta = $html | query web --query '#infobox .title, #infobox .tags' | { title: $in.0.0, tags: $in.1.0 }
    let $content = $html | query web --query 'main img' --attribute data-src
    $meta | save meta.json
    

    or

    1..30 | each {|x| http get $'https://example.org/img/($x).jpg' | save $'($x).jpg'; sleep 100ms }
    

    Depending on the tools you use, it’ll be quite similar or very different.
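
    To illustrate how different the same task can look with other tools, here is a rough plain-shell sketch of the attribute-extraction step. The inline HTML is hypothetical stand-in data, and regex-based extraction like this is fragile compared to a real CSS-selector query — it's only a sketch of the idea:

    ```shell
    # Hypothetical inline HTML standing in for a fetched page
    html='<main><img data-src="/img/1.jpg"><img data-src="/img/2.jpg"></main>'

    # Pull out every data-src attribute value, roughly what
    # `query web --query 'main img' --attribute data-src` does in Nushell
    printf '%s\n' "$html" | grep -o 'data-src="[^"]*"' | sed -e 's/^data-src="//' -e 's/"$//'
    ```

    A selector-aware tool is the better choice for anything beyond trivially regular markup, which is exactly what `query web` provides.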

    Selenium is an entire web-browser driver, meaning it does a lot more and has a more extensive interface because of it; you can also talk to it through different interfaces and languages.