Lightweight Browser

Amen to that.

I have web sites. Some use no cookies at all. Others use cookies purely to maintain state information.

The lie that 'cookies are necessary for this site's operation' should be made a criminal offence.

--
The lifetime of any political organisation is about three years before
it's been subverted by the people it tried to warn you about.

Anon.
The Natural Philosopher

Good enough for images but not for some of the more extreme uses of AJAX. Enough time has passed that I can give a simplified overview of how the Yahoo! front page used to work (it clearly doesn't work that way now) when it was a complex mass of widgets.

A page request comes in, is recognised as such by the front end proxy and is sent to the VIP for the 'fast' server farm. That server decides which page layout is needed and fires off all the internal service requests for outline content and status for each widget in the layout. The SLAs on the internal services are *really* tight, so they serve from cached data. When all of those are back, the page is assembled with ad-server hooks and sent to the user. As the page is served, the hooks are replaced by advert content. At this point the widgets have really minimal content, usually just enough to show the status lines, load the right images and fix the page layout. Once the page hits the browser they all send requests for fill-in content so that the mouseovers and popup links start to work. Those requests hit the front end proxy and get sent to the 'slow' server farm, which also updates the cached data used by the internal services.

Simplified because I've left out a whole pile of proprietary detail on scaling, geographic distribution, CDN usage, tracking, failure handling, DDoS handling etc. etc. What remains pretty much applies to any "Web 2.0" AJAX-based page.
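As a rough illustration of the two-phase flow described above, here is a minimal Python sketch. It is not Yahoo!'s code: the function names, the cache layout and the stand-in `slow_backend` are all assumptions made for the example. The only point it tries to show is that the first response is built from cached status lines alone, while the later fill-in requests do the expensive work and refresh the cache.

```python
def slow_backend(widget):
    # Hypothetical stand-in for an expensive lookup on the 'slow' farm.
    return {"status": f"{widget} ok", "detail": f"full content for {widget}"}


def render_outline(layout, cache):
    """Phase 1 ('fast' farm): build the page from cached status lines only."""
    parts = []
    for widget in layout:
        # Never call the slow backend here; fall back to a placeholder.
        status = cache.get(widget, {}).get("status", "...")
        parts.append(f'<div id="{widget}">{status}</div>')
    return "".join(parts)


def fill_in(widget, cache):
    """Phase 2 ('slow' farm): answer a fill-in request and refresh the cache."""
    data = slow_backend(widget)
    cache[widget] = data
    return data["detail"]


cache = {"mail": {"status": "3 unread"}}
layout = ["mail", "weather"]
print(render_outline(layout, cache))   # weather shows only a placeholder
fill_in("weather", cache)              # browser-side request after page load
print(render_outline(layout, cache))   # next outline uses the refreshed cache
```

The design choice worth noticing is that the fast path never waits on anything slower than a cache lookup, which is what lets the internal SLAs stay tight.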

--
Steve O'Hara-Smith                          |   Directable Mirror Arrays 
C:\>WIN                                     | A better way to focus the sun 
The computer obeys and wins.                |    licences available see 
You lose and Bill collects.                 |    http://www.sohara.org/
Ahem A Rivet's Shot

"The Natural Philosopher" wrote

| On 16/08/2020 21:04, Poprocks wrote:
| > There's no point in utilizing a system like WordPress to generate pages
| > dynamically, if those pages themselves are actually static. It's just a
| > waste of much-needed server resources.
|
| There may be, if those pages can thereby be indexed and centrally backed
| up and generally subjected to all the cool stuff databases make possible.
| How much overhead is it to read a text file out of a database as against
| out of a native file system?
|

That's what companies like Wix are doing. WordPress seems to be similar if you use them as the host. With Wix it's even more extreme. The website URL, such as acme.com, is redirected to the Wix server. The page is little more than a JSON list. There isn't actually any server at acme.com.

It's a way to provide fast, efficient and easy website creation through WYSIWYG tools online, with the website owner being able to completely ignore the technical details. It's also a design that allows Wix to set up what might be called web design scam #1: Get the customer to sign on to a hosting service so that they'll have to pay a monthly subscription and you can hold their website files hostage.

There may also be advantages in terms of selling visitor data, putting in ads, etc. Since I don't allow script I can see little or nothing of Wix-hosted websites, so I don't know how adulterated they might be. But I do know that a lot of small businesses are using Wix.

I'd guess that most changeable corporate sites are also using backend databases. Something like a newspaper changes content frequently. The number of pages is vast. And probably all of them are already using backend code, anyway. A truly static webpage is rare these days. I set my browser cache limit at 10 MB because virtually every page I visit registers as a new page. Nothing's ever coming out of the cache because there's never a 304 (Not Modified) status code returned!
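For anyone wondering what a server has to do for the cache to be useful, here is a minimal Python sketch of the conditional-GET decision. It is an illustration only: the `respond` helper and the hash-based ETag are assumptions, and real servers also handle weak validators, lists of ETags and `If-Modified-Since`, all of which are left out here.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive an ETag from the body (one common approach, assumed here)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(request_headers: dict, body: bytes):
    """Return (status, headers, payload) for a GET of the given body."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if request_headers.get("If-None-Match") == etag:
        # The browser's cached copy is still current: send no body.
        return 304, headers, b""
    return 200, headers, body


page = b"<html><body>static page</body></html>"
first = respond({}, page)                                     # cold cache
second = respond({"If-None-Match": first[1]["ETag"]}, page)   # revalidation
print(first[0], second[0])  # prints: 200 304
```

A page generated afresh for every visitor (new tracking IDs, new ad markup) produces a different body each time, so a check like this never matches and the 304 never happens.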

Mayayana

"Ahem A Rivet's Shot" wrote

| I said server side - I used to work on the Yahoo! front page team.
| I know exactly what I am talking about. While the page is being constructed
| on the server *before* it is sent to the client adverts are injected into
| the page by the ad server (which is a fearsome beast).
|

That might have been true in the days of banners being served from the same domain. These days it's typically a code snippet calling Google/Doubleclick with a company ID, from javascript in the page, clientside. That's not any kind of "injection". It's script in the page.

There haven't been static ad images in pages for years. It was hard to track the number of ad views, and there was no targeting. So no one wanted to pay for them. These days it's all dynamic, script-based, calling in vast spyware/ad companies like Google/Doubleclick. The actual site you're visiting has almost no part in it. Which is actually convenient in a way. A short HOSTS file blocks virtually all ads and tracking.

Loading yahoo.com just now shows that they're trying to get me to call in 4 spyware beacons through NOSCRIPT tags. If I allowed script they'd be calling atwola.com for an ad, sending them my userAgent, OS and version, the fact that I'm on a desktop (or their guess that I am), and various other bits and IDs. So they did generate that code custom and sent it to me in the page, but any ad would be a clientside scripting operation carried out with atwola. Since I block both script and atwola, they never get called and I can still read about George Clooney or get my horoscope without being tracked or seeing ads. :)
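For anyone who wants to try the HOSTS approach, here is a small Python sketch that turns a list of domains into HOSTS lines. The helper name and the choice of 0.0.0.0 as the sink address are assumptions, and the two domains are only illustrative (atwola.com from the above, doubleclick.net as the usual Doubleclick domain). Note that a HOSTS file matches exact hostnames only, with no wildcards, so each subdomain you want blocked needs its own line.

```python
def hosts_entries(domains, sink="0.0.0.0"):
    """Build HOSTS lines that point each hostname at a dead address."""
    # Normalise and de-duplicate so repeated or mixed-case names collapse.
    cleaned = sorted({d.strip().lower() for d in domains if d.strip()})
    return [f"{sink} {name}" for name in cleaned]


blocklist = ["atwola.com", "doubleclick.net", "Atwola.com "]
print("\n".join(hosts_entries(blocklist)))
# prints:
# 0.0.0.0 atwola.com
# 0.0.0.0 doubleclick.net
```

The output would be appended to the system HOSTS file by hand; the sketch deliberately does not write to it.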

Mayayana

"Andy Burns" wrote

| It's a newish attribute but avoids using
| js for that.

That's interesting. I didn't know about it. But it is *very* new. Chrome 76, according to w3c. So not likely to be used for some time. Also, there's still the problem of pages trying to guess your screen size and then load one of several size options based on that. That's also completely broken without script. Typically, if there's any default SRC at all it's tiny.

Mayayana

But how do you know it is a web server extension? More often it is something embedded in the web page that is doing the job, which was put there deliberately by the web page owner.

Look, it is about who takes the responsibility. In your words some third party alters the content without the knowledge of the web site owner, which IMO is not correct.

I can understand that it has become a plague during the last decade or so, but as was explained, everyone wants to see cash, so everyone in the chain knows what is going on and is cashing in on it.

The fact that the content comes to you via SSL means no one is altering it between you and the server, and it means (presuming the server is not compromised) you see exactly what the web site owner wanted you to see, including the tons of advertising that no one knows the origin of.

Deloptes

Standard issue web servers only read the requested plain text from the host filing system and send that to the requester without modification. Images etc are retrieved and displayed by the browser as it prepares the page of text for display to the user. This is all the web server is designed and built to do.

The code needed to process anything, such as PHP or Javascript inserts in the web page is not a core part of the web server, so you have to install code such as PHP and Javascript interpreters separately from the HTML server and then configure the server to call them. By definition these external functions are known as server extensions since they are not integral parts of the web server.

Since the extension calling mechanism exists, it is also possible to write custom extensions in C, Java or any other language you prefer and either embed triggers for them in the web pages you're serving or to call an extension to generate the page text instead of reading it from a filing system.

The latter is how sites such as eBay and Amazon work and is also how adverts get injected into pages.

Don't put words in my mouth.

I said that the AUTHOR of the page does not make these alterations, but the *owner of the web server* can and often does configure his webserver to do stuff such as injecting adverts into pages that other people, e.g. Wikipedia volunteer editors, have written.

Not strictly true - man-in-the-middle attacks can do exactly that. Almost certainly the Great Firewall of China could do it too, but it mostly just blocks stuff it doesn't like. GCHQ and the NSA could also modify messages in transit if they wanted to but are more interested in reading them.

Correct

also correct.

incorrect: the web server owner knows exactly where the ads are coming from because he's getting paid to inject them into outgoing pages.

--
Martin    | martin at 
Gregorie  | gregorie dot org
Martin Gregorie

Yeah, that's about right; Safari is the laggard.

That's what <picture> is meant for, in combination with multiple <source> child elements with media selectors for various screen sizes.

Andy Burns

No they don't - ads are supplied through one, or often many, layers of brokers, each taking their fraction of a cent cut, and passing the buck of responsibility for any malware which may be delivered from the ultimate end point.

---druck

druck

Yes that's how Google and DoubleClick and just about everyone else serve ads to other people. It is *not* how Yahoo! inserted ads into their own AJAX heavy front pages, it is probably not how Google insert ads into their own pages, but of that I have no direct knowledge.

Nobody said anything about static ad images - these things are widgets injected complete with HTML, CSS and JavaScript.

Yes yes yes - that's why the ad server was such a beast. It locates a suitable ad based on the user's data, satisfying all the dozens of contract terms and standards about how often the same ad or ads in the same group may appear to the same person and a whole bunch of other stuff and then provides the HTML with all the tracking baked in for the webserver to inject where the marker is. It does all of this in an obscenely short time.

I was talking about Yahoo!, who do their own thing and like to run their own spyware.

Ahem A Rivet's Shot

I can imagine pages doing that, but I'd use CSS (no JS required) for that. Eg, my blog has CSS to scale images to page width, but only for smaller screens. This page has example pictures:

https://qaz.wtf/qz/blosxom/2020/03/15/reuters

https://qaz.wtf/qz/css/main.css

@media (max-width: 800px) {
  /* ... */
  .mainblock {
    padding: 1vh 2vw;
    width: 96vw;
    left: 0;
  }
  /* ... */
  .imgbox > img {
    width: 90vw;
    height: auto;
  }
}

I don't do anything like that for blocks which makes the terminal text screenshots in the most recent post look quite bad. I'm not even sure what the best fix would be. Shrink the font? Grow the background and force side scrolling?

Elijah

------ by no means a CSS expert

Eli the Bearded

Has a nasty tendency to produce beautifully laid out pages with unreadable text.

Is a PITA.

Yep there's no good answer - you can only attempt to make the best compromise.

Ahem A Rivet's Shot

Agree. I often use text for blocks and if, as is often the case, the block represents console output, I'll wrap command lines as needed and add backslash continuation markers. Works on PC screens, probably not so well on a phone or small pad.
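The wrapping can be automated. Here is a minimal Python sketch, with a made-up function name and an arbitrary default width, that breaks a long command line at spaces and adds a trailing backslash to every line except the last. It splits on whitespace only, so a quoted argument containing spaces could be broken in the middle; treat it as a starting point rather than a finished tool.

```python
def wrap_command(command: str, width: int = 60, indent: str = "    ") -> str:
    """Wrap a shell command at spaces, marking breaks with ' \\'."""
    lines, current = [], ""
    for word in command.split():
        candidate = word if not current else current + " " + word
        # Leave room for the trailing " \" on every line but the last.
        if current and len(candidate) + 2 > width:
            lines.append(current + " \\")
            current = indent + word      # continuation lines are indented
        else:
            current = candidate
    if current:
        lines.append(current)
    return "\n".join(lines)


cmd = ("rsync -av --delete --exclude=.git "
       "/home/user/project/ backup:/srv/backups/project/")
print(wrap_command(cmd, width=40))
```

A single word longer than the width is left on its own over-long line, since there is no safe place to break it.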

Martin Gregorie

Fair enough - I don't dispense ads or use an ISP that does.

So, I'd assumed that, if a webserver owner wanted to sprinkle ads on web pages being transmitted from his site, that he'd sign up with one of the ad brokers to get an ad stream from them plus a cut of the ad income, and that he would in turn pass a pittance on to the page authors.

Martin Gregorie

"Andy Burns" wrote

| > So not likely to be used for
| > some time. Also, there's still the problem of pages trying to
| > guess your screen size and then load one of several size
| > options based on that. That's also completely broken without
| > script.
|
| That's what <picture> is meant for, in combination with multiple
| <source> child elements with media selectors for various screen sizes
|

People are using it requiring script. There's a good example here:

formatting link

Audubon photography awards. They've got 5 different image sizes specced! Then there's IMG DATA_SRC, which is one of them. But there's no IMG SRC. So without script there's no image at all. The whole design is idiotic. They could have just used the default size with a note to "click to view full size". Or better yet, show default size with a link to "click to download all images full size". Which is what I edited the page to do. :)

Mayayana

Inspecting the DOM, the final <img> does have a src= attribute, but I'm probably seeing it after their script has run, fiddled with the data-src attribute and populated the src= on the fly from it, or whatever. It seems a pointless way to use it.

Andy Burns

[ re: https://qaz.wtf/qz/blosxom/2020/08/14/notint ]

Well... I'm not thinking about shrinking everything on the page/site, just the block that doesn't fit on the screen of a phone.

Currently the text overflows the background I've put in for the block, which is set to phone screen width.

In this particular case, I've got essentially ASCII art. I'm showing two different terminal-based Tetris screens, and didn't want to use an image because then it breaks in text browsers like lynx / links / etc.

(Screen readers, of course, do very poorly with ASCII art, so there's no winning.)

Elijah

------ hoping some of them at least can let the listener skip content

Eli the Bearded

Now I understand exactly what you were on about and why.

And now, small graphical thrill for the younger members of this group, who may not have heard of this gem or, indeed, of asciimation:

formatting link

I'm pleasantly surprised to see it's still online and not consigned to the Internet Archive.

Martin Gregorie

I still have the feeling you are not using the terms properly, especially the term "injection". This is my problem.

According to my understanding there is no "injection"; "loading" would be more appropriate, as the operators of a site allow advertisements and similar content to be loaded from third parties onto the web site. You usually consent to this when visiting the site.

You also seem not to accept the fact that there are dynamic web pages generated by web server extensions such as Apache's PHP extension. Please note that whatever methodology is used, the visitor to a web site receives HTML sent to the browser. The dynamic part is only the generation of this HTML, as opposed to the static content you pretend to be the only "true" HTML.

I think you should review your understanding of the above and stop confusing the audience.

Thank you in advance!

Deloptes

Injection is the term that I've seen used.

Basically you have some system that creates or holds precreated pages. Something else along the way edits those in-transit to include per user additions, typically ads.

The difference between "injection" and general server-side dynamic generation (as you'd have with PHP) is that pages that get injection are designed with specific places for specific types of things to be added, but the injector doesn't do anything else to the page. It's all about *adding* stuff, hence *injection*.
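To make that distinction concrete, here is a minimal Python sketch of marker-based injection. The `<!-- SLOT:name -->` marker syntax and the `inject` function are invented for the example; they are not how OpenAdStream or any other product marks its slots. The injector fills the slots it has content for and leaves every other byte of the page alone.

```python
import re

# Hypothetical marker syntax: <!-- SLOT:name -->
MARKER = re.compile(r"<!--\s*SLOT:(\w+)\s*-->")


def inject(page: str, fragments: dict) -> str:
    """Replace each known slot marker with its fragment; touch nothing else."""
    def fill(match):
        name = match.group(1)
        # Unknown slots are left in place, so the page still renders.
        return fragments.get(name, match.group(0))
    return MARKER.sub(fill, page)


page = "<h1>News</h1><!-- SLOT:top --><p>Story text</p><!-- SLOT:side -->"
ads = {"top": '<div class="ad">Ad for user 42</div>'}
print(inject(page, ads))
# prints: <h1>News</h1><div class="ad">Ad for user 42</div><p>Story text</p><!-- SLOT:side -->
```

The same function could sit in a server module or in a separate proxy in front of the web server; the page author only has to agree on where the markers go.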

The one I worked with was OpenAdStream, 15 to 18 years ago. It ran as an Apache module and was intended for companies who sold their own ad inventory. We used it for weird ass page personalization, but not very successfully. The NYTimes used it for in-house ads at about that same time, I remember recognizing the artifacts.

It's equally possible to do this sort of work with a whole separate program or device sitting somewhere in front of your web server, kinda like Varnish will sit in front of your webserver to act as a cache.

Ad injection can also be done by hostile third parties, and is one of the things that https works to prevent. Typically those ones have very naive ad placement options.

These pages are much more sophisticated in power, but are not as sharply focused on finding the most profitable guess for an ad.

There's not one "true" HTML for a lot of sites. Consider Yahoo or Gmail: all of the page is customized for a logged in viewer. From email to list, to preferences on colors, to news stories to show, to ads to place.

Elijah

------ OAS seems to still exist, but not sure if it's the same product

Eli the Bearded
