Lightweight Browser

Yes, I'm quite sure we all know how it works. Let me try & summarise:

- The server can pass through a static html file (which may contain any number of client-side scripts). The only things it "injects" are HTTP headers. I wouldn't call this injection.

- The server can partly or completely generate the page using Node, PHP, ASP, Perl, SSI, etc. It can send, receive, and "inject" anything from any other server, but it will ultimately arrive as static html (with or without client-side scripts). I wouldn't call this injection either, unless some remotely received data is not what the web developer thought it would be, which is probably bad, but which the developer should have anticipated. Always clean your input from any source. (There's a sketch of these first two cases after this list.)

- The server may post-process the generated page and inject stuff. For example, banners for "Made on Shitty Service, Inc." on an otherwise free homepage service. This is unfortunate because it alters the page from what the web developer intended for you, and you may get ads and trackers, which can hopefully be blocked by a client-side ad blocker. I would call this injection, but it's probably not too bad unless the server is malicious or it has been taken over by someone malicious.

- A third party sitting between the server and the client (an ISP, a proxy, or some other man-in-the-middle) can modify the stream and inject stuff. This is the ugliest sort of injection.

- The static html can contain client-side scripts (javascript) or links to external sources (like iframes or even seemingly static images) which, if allowed by the client, can run and modify and load anything from any server. There are *some* restrictions in modern browsers but this can get very ugly. On the other hand, this is how "web apps" operate nowadays: transfer a skeleton html file plus a set of scripts, load the important stuff afterwards (like your email, if it's an email web viewer). So it's probably injection but if it's legit and as intended, then :shrug:
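For the first two cases, a minimal sketch in Node/TypeScript (the file name, port, and markup are invented for illustration). Whether the server passes a file through or generates the page, what reaches the client is plain HTML plus the headers the server adds:

    // Sketch only: static pass-through vs. server-side generation.
    import { createServer } from "http";
    import { readFile } from "fs/promises";

    createServer(async (req, res) => {
      if (req.url === "/static") {
        // Case 1: pass a static html file through untouched.
        // The only thing the server adds is HTTP headers.
        const html = await readFile("./index.html", "utf8");
        res.writeHead(200, { "Content-Type": "text/html" });
        res.end(html);
      } else {
        // Case 2: generate the page on the server. The browser cannot
        // tell the difference; it still receives static html.
        res.writeHead(200, { "Content-Type": "text/html" });
        res.end(`<html><body>Generated at ${new Date().toISOString()}</body></html>`);
      }
    }).listen(8080);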

Reply to
A. Dumas

E.g. every WordPress page out there, which is quite a lot!

It's a horrible way to create web pages IMHO and I see no particularly good reason for it except that everyone knows (or thinks they know) how to interface PHP with MySQL.

--
Chris Green
Reply to
Chris Green

In general yes. The exception is when you really *are* dealing with a large database and use HTML to present its content. See

formatting link

Also look at (picking one at random):

formatting link

Arguably it would be far better to have one single PHP page inserting the different images as required, instead of generating hundreds of nearly identical static pages through local scripts and uploading them all. Think of finding one typo. Case one: edit one single page, done. Case two: edit one template, regenerate hundreds of pages and upload them all again.
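A sketch of that idea - in Node/TypeScript here rather than PHP, with invented file names - to show why the single-template route wins on maintenance: the boilerplate lives in exactly one function, so fixing a typo means editing one place whether the page is served on demand or used to regenerate static files:

    // One template instead of hundreds of near-identical pages (sketch).
    const page = (title: string, img: string): string => `
      <html><head><title>${title}</title></head>
      <body><h1>${title}</h1><img src="${img}" alt="${title}"></body></html>`;

    // Case one: a server renders this per request. Case two: a local
    // script loops over all images and writes hundreds of static files.
    console.log(page("Photo 17", "photos/17.jpg"));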

I don't as yet (simply using someone else's WordPress template is not knowing and being able to do it myself). But it would be worth learning.

--
/ \  Mail | -- No unannounced, large, binary attachments, please! --
Reply to
Axel Berger

I don't have much issue with JS the language. JS won over Java because Java wasn't integrated with web pages - a Java applet was just a rectangle into which it rendered pixels. The same was somewhat true of Flash. Meanwhile JS was able to interact with the DOM, which gave us all the kinds of interactive HTML that we're now familiar with. HTML and then JS when you needed it was much more responsive than HTML and a Java app that would take 5-10 seconds to start.

That's not to say JS is perfect. I think it's better as an interpreted language than a precompiled one (as Java bytecode or Flash is) - you still have to run an interpreter or JIT for Java bytecode as you do for JS, so having bytecode actually removes semantic content compared with having the source available. It would have been nice to have stronger typing, but that's largely a problem for programmers, not users.

I disagree. It's much better to do the lifting on the client side - because the latency on the client is much lower than having to make requests over the internet. When doing it server-side the user spends a lot of time in 'Waiting...' 'Connecting...' etc. You'd never put up with that for native apps, and you shouldn't for the web either.

I'm sure from the ad network's perspective :)

That's fine if it's a 'page' of static content like a newspaper. If it's Facebook much of what people do won't work (no likes, no posting, no replies, no messaging, no photo tagging, no videos, etc). Yes you can read Facebook like a newspaper but most people want interaction. And FB has no concept of 'pages' - it's an infinite scroll powered by JS.

I'm not sure that's intrinsic, merely that Docs is designed to run on desktops/laptops (lots of RAM, big screens, etc) and the app is designed for a mobile environment (small screens, less RAM, less connectivity). They have been optimised for niches and it's not clear that is a function of the language.

(everything from the Atom line of processors is slow, IMX)

Interesting. There's some other Webkit- and Blink-based browsers that might be worth playing with, although I'm not sure what counts as lightweight:

formatting link

Theo

Reply to
Theo


That sounds like the sort of thing I'd do with Java and SQL. The JDBC class interface is pretty slick and handles large fields well. Tricks like using one field to hold an e-mail MIME part or the HTML text between ... tags.

--
Martin    | martin at 
Gregorie  | gregorie dot org
Reply to
Martin Gregorie

I was being specific here, deliberately describing something everybody has seen: a Wikipedia page. Here the page content (text and images) is obviously written by a volunteer editor and sits in some sort of datastore in more or less the format that its author and contributors hit save on. This is what I meant by the 'page content'. When a page is requested, more stuff gets added:

- the standard Wikipedia header and footer (not written by the author, but no more contentious than the content of the HTML section). These always have the same format but are not written by the page author.

- the half screenful of donation request that gets added every time Wikipedia needs more cash. To me this is no different to any other advert, and again is not written or supplied by the page author.

I'm only describing it this way to emphasize to the OP that a web server can and does inject additional material into a web page before it is sent to the browser.
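A toy sketch of that kind of server-side composition (not Wikipedia's actual code; the function and the flag are invented): the stored content is wrapped in the standard header and footer, and the donation appeal is spliced in only when some switch is on:

    // Toy sketch of server-side injection; names are invented.
    function renderPage(storedContent: string, needCash: boolean): string {
      const header = "<header>standard site header</header>";
      const footer = "<footer>standard site footer</footer>";
      const appeal = needCash ? "<div>Please donate...</div>" : "";
      // The author wrote only storedContent; everything else is
      // added by the server before the HTML reaches the browser.
      return `<html><body>${header}${appeal}${storedContent}${footer}</body></html>`;
    }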

Similarly, the browser can only display the content of the page it receives with the following exceptions:

- it can and does use img tags to retrieve and display images, which may or may not come from the same server as the HTML text containing them. These tags are in the web page as received from the server.

- code such as ad-blockers running in the browser can and does prevent the content these tags reference from being displayed, but cannot inject text, image selection tags, etc. that were not in the HTML page sent by the server.

I'd remind you that the OP seemed certain that:

- the web server could not modify the page it sends to the browser

- the browser could and does add additional material to the page it received from the server.

--
Martin    | martin at 
Gregorie  | gregorie dot org
Reply to
Martin Gregorie

Definitely overused. So many sites download megabytes of javascript to do really simple things that could just be html/css. Made worse by including entire massive libraries/frameworks when only using a small subset (this problem afflicts lots of bloated desktop software too unfortunately).

What's with all those sites where every image is actually a chunk of javascript that goes off and fetches the image? What's wrong with a simple img tag?!? And then of course if you click on a thumbnail, instead of letting the browser display the full size image, it loads a javascript image viewer that is always slower and more cumbersome to use than anything on your local machine :-(

That's not strictly accurate. One of the few things I'd praise Facebook for is that they do provide a non-javascript version, and all those features still work. It's what I always use because it is SOOO much faster than the standard version.

Sadly, many sites do not degrade so nicely with javascript off :-(

*waves* Hi Theo, long time no see :-)

Bryan.

--
RISC OS User Group Of London  -  http://www.rougol.jellybaby.net/ 
RISC OS London Show           -  http://www.riscoslondonshow.co.uk/
Reply to
Bryan Hogan

This may be one of the reasons why static site generators (SSGs) are taking off.

There's no point in utilizing a system like WordPress to generate pages dynamically, if those pages themselves are actually static. It's just a waste of much-needed server resources.

Reply to
Poprocks

"Bryan Hogan" wrote

| What's with all those sites where every image is actually a chunk of
| javascript that goes off and fetches the image? What's wrong with a simple
| img tag?!?

Yes. That's usually lazy-load code, to prevent images from loading unless you scroll down that far. But then they don't use a valid SRC attribute, so without script it's broken. But I suppose it makes sense. Most visitors now are on a phone and have no intention of paying attention long enough to actually read a webpage, so why send them images they'll never see?
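The pattern those sites use looks something like this (a reconstruction, not any particular site's code): the real URL hides in a data-src attribute and a script swaps it into src when the image nears the viewport, which is exactly why nothing shows without script:

    // Reconstructed lazy-load pattern, for illustration only.
    // Markup assumed: <img data-src="real.jpg" src="placeholder.gif">
    const observer = new IntersectionObserver((entries) => {
      for (const entry of entries) {
        if (!entry.isIntersecting) continue;
        const img = entry.target as HTMLImageElement;
        img.src = img.dataset.src ?? ""; // swap the real URL in
        observer.unobserve(img);
      }
    });
    document.querySelectorAll<HTMLImageElement>("img[data-src]")
      .forEach((img) => observer.observe(img));

Browsers can now defer offscreen images natively with loading="lazy" on a plain img tag, no script required.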

Today I wanted to read an article at the NYT. I clicked. The page was there but no article. I disabled CSS. Still no article. I looked at the source. There was a vast pile of script that seemed to be calling for something like an array of text strings to assemble the article plus ads via script. Weird stuff. Line after line of base-64 links. When decoded it became something like nyt://article/[GUID here]
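Decoding one of those base-64 strings is a one-liner in Node/TypeScript, for anyone who wants to repeat the check (the sample string below is invented and encodes a made-up ID):

    // Decode a base-64 string of the kind described above.
    const encoded = "bnl0Oi8vYXJ0aWNsZS8xMjM0"; // invented sample
    console.log(Buffer.from(encoded, "base64").toString("utf8"));
    // -> nyt://article/1234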

Each line had the same GUID, but a numeric parameter differed. Apparently that was some kind of syntax the script recognized as parts available in some kind of server-side database. They had taken numerous steps to obfuscate the actual webpage content, presumably so that no one would see it if they didn't submit to script-based spying and ad display. That usage of script has become common. It's used to break pages, put transparent DIVs on top of links to break those... all sorts of devious stuff to ensure that the only people who ever see the webpage are those people who can't see anything but all the ads jumping around. It's as close as they can currently get to broadcast. They can only give you the files you ask for, but they can design it so that the overall result is a downloaded javascript software program that forces you to let them actively control an animated process -- or see nothing at all. In a way it's as though the page has been adapted to replace the Flash executable.

Reply to
Mayayana

Rendering images with JS as you scroll to them serves two purposes. One benign, one not.

  1. It makes the part of the page in view load before the content that is scrolled off the bottom. Otherwise it could be more random what loads when. This allows you to start reading the page faster, and reduces the memory footprint - and bandwidth, if you are not going to scroll at all.
  2. It lets the page owner track how far you have scrolled, and how fast you did that scrolling.
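The second purpose needs only a few lines; a sketch (the /metrics endpoint is invented) that records the deepest scroll position and the elapsed time, then reports it when the reader leaves:

    // Sketch of scroll tracking; the reporting endpoint is invented.
    let maxY = 0;
    const t0 = performance.now();
    addEventListener("scroll", () => { maxY = Math.max(maxY, scrollY); });
    addEventListener("pagehide", () => {
      // sendBeacon survives page unload better than fetch/XHR
      navigator.sendBeacon("/metrics",
        JSON.stringify({ maxY, ms: performance.now() - t0 }));
    });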

They had nothing that worked with Lynx last time I tried interacting with them; and that interaction was trying to get them to stop emailing me.

Indeed.

Elijah

------ wants pages to work in lynx

Reply to
Eli the Bearded

Ugh, yeah. When I was reading this thread on Friday, I tried testing Facebook on Links, and not only did it not work at all, but I got an email from Facebook saying they noticed a "suspicious" login attempt, and blah blah blah...

Reply to
Poprocks

The loading="lazy" attribute is newish, but it avoids using js for that.

Reply to
Andy Burns

Thank you for the summary. This is exactly my view of things. Also, to prevent the so-called "injection" by a 3rd party, there is SSL, which probably makes it impossible to inject anything.

I engaged in this discussion because the statements made by some members were confusing or misleading.

Reply to
Deloptes

Why do you think so? What makes you think the author/owner of Wikipedia did not consent to this?

regards

Reply to
Deloptes

There may be, if those pages can thereby be indexed and centrally backed up and generally subjected to all the cool stuff databases make possible. How much overhead is it to read a text file out of a database as against out of a native file system?

--
Those who want slavery should have the grace to name it by its proper  
name. They must face the full meaning of that which they are advocating  
Reply to
The Natural Philosopher

There are CMSs that use a database locally during editing, then generate a set of static files and upload them to a 'plain' web server.
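The generation step of such a CMS can be tiny. A sketch in Node/TypeScript (file names invented) that turns a list of records - which could as easily come out of a local database - into static files ready to upload to a plain web server:

    // Sketch of a static-generation build step; names invented.
    import { mkdirSync, writeFileSync } from "fs";

    const pages = [
      { slug: "about", body: "<p>About us...</p>" }, // e.g. rows from a local DB
      { slug: "news", body: "<p>Latest news...</p>" },
    ];

    mkdirSync("dist", { recursive: true });
    for (const p of pages) {
      writeFileSync(`dist/${p.slug}.html`, `<html><body>${p.body}</body></html>`);
    }
    // Upload dist/ anywhere; nothing dynamic remains.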

Reply to
Andy Burns

I know, I was just having fun - sorry.

A worthwhile and trivial expenditure.

--
Steve O'Hara-Smith                          |   Directable Mirror Arrays 
C:\>WIN                                     | A better way to focus the sun 
Reply to
Ahem A Rivet's Shot

The Wiki contributors may have consented as part of the T&C. Read those to find out - I'm not one of their editors so could care less.

In any case, as I've said, the simple and obvious way to tout for cash contributions is to handle it via a web server extension which the sysadmins enable when the bank balance is getting low.

Sites that inject adverts will be using a similar mechanism, if only because most adverts are images while Wikipedia's cash requests are, IIRC, text inserts.

--
Martin    | martin at 
Gregorie  | gregorie dot org
Reply to
Martin Gregorie

I said server side - I used to work on the Yahoo! front page team, so I know exactly what I am talking about. While the page is being constructed on the server, *before* it is sent to the client, adverts are injected into the page by the ad server (which is a fearsome beast).

--
Steve O'Hara-Smith                          |   Directable Mirror Arrays 
C:\>WIN                                     | A better way to focus the sun 
Reply to
Ahem A Rivet's Shot

"Deloptes" wrote

| Also, to prevent the so-called "injection" by a 3rd party, there is SSL,
| which probably makes it impossible to inject anything.

Not hardly.

formatting link

formatting link

When people want script and iframes for spying and ads, there's no way to keep it clean. A typical scenario is that a malware spreader buys ad space, which shows up in iframes, which allows exploiting cross-site scripting vulnerabilities.

Iframes and script were both being phased out before the Web turned into ad servers with "content" stuffing. Now both are ubiquitous. That can never be made safe. The irony is that allowing ads to load is now one of the riskiest activities online.

Your description above is accurate in theory, but there are all sorts of variations. One is that many webmasters really don't know what's being pulled in. They copied some code to get fancy fonts. They copied code to use jQuery. Do they know where those are coming from or whether they're up to date? Not likely. Then there are 4th and 5th parties. A page pulls in script from 2 outside sources. They pull in script from 5 more. It mushrooms. The original idea with the Internet, and the design of cookies, was to ensure privacy and security within a site. That's turned into something more like a public bathroom in Times Square with only 2 walls. Does the webmaster at the site you visit know that? Not likely. They're busy trying to find writers to produce "content" for pennies so they can get paid. If 17 ad companies, dataminers, and general sleazeballs being pulled in via script will provide more money then why not? The webmaster doesn't actually understand how that process is working, anyway.

In the case of ad attacks, again no one's minding the store. Places like AOL or NYT or TheHill (some of the sites compromised in the past) just add code snippets to call in Google ads. They have no further interest, except to get paid. Google then auctions that ad space to the highest bidder. They have no further interest except to get paid. So someone in China or Russia buys an iframe and launches an attack.

This is the epidemic disease of Silicon Valley. It's the same reason Facebook is infested with propaganda. The geeks worship technology and regard human input as a failure of technology. If humans have to be involved it costs much more and runs more slowly. So it's all automated. And the "content-producing" corporate sites are only looking at profits.

Reply to
Mayayana
