I could not follow your problem description completely and did not read all of your PDF files, but one thing is sure:
You cannot return from HC08 user code to the built-in monitor with an RTS instruction. The reason is that the monitor itself invokes your user program with an RTI instruction and does not push any return address.
To start mon08, jump to the monitor address. Unfortunately, this start address depends on the mask version. To determine it, look at the vectors in the security code table minus an offset (0x100?). See also the monitor chapter in the data sheet.
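The vector lookup described above can be sketched on the host side. This is only a minimal Python illustration, assuming the monitor's vector table mirrors the user vector table at an offset of 0x100 (the guess made above; it matches reading 0xFEFC for a user SWI vector at 0xFFFC). The `dump` dictionary, its byte values, and the function names are hypothetical, and the real offset must be checked against the data sheet for your mask.

```python
USER_SWI_VECTOR = 0xFFFC   # user SWI vector address
ASSUMED_OFFSET = 0x100     # assumption: monitor vectors sit 0x100 below


def read_vector(dump, addr):
    """Return the 16-bit big-endian vector stored at addr and addr + 1."""
    return (dump[addr] << 8) | dump[addr + 1]


def monitor_entry(dump, offset=ASSUMED_OFFSET):
    """Look up the monitor entry address via the mirrored SWI vector."""
    return read_vector(dump, USER_SWI_VECTOR - offset)


# Hypothetical memory contents, for illustration only.
dump = {0xFEFC: 0xFE, 0xFEFD: 0x20}
print(hex(monitor_entry(dump)))
```

If the offset differs on your mask, only `ASSUMED_OFFSET` needs to change.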
For the AZ60 I have the following code in use:
    tsx             ;SP into HX
    inc  4,x        ;correct program counter by increment
    bne  l1
    inc  3,x        ;16 bit increment
l1: lda  0xfefc     ;get monitor vector
    psha            ;user swi vector is @ 0xfffc
    ldx  0xfefd     ;get monitor start lowbyte
    jmp  0,x        ;re invoke mon08
Simpler code could work for specific CPU masks, or if your monitor does not need a correct PC value to display after the user program has returned.
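To make the program counter correction easier to follow, here is a small Python model of the 16-bit increment on the stacked frame. It assumes the layout that the offsets 3,x and 4,x imply after TSX (PC high byte at offset 3, PC low byte at offset 4). The frame contents and the function name are made up for illustration.

```python
def increment_stacked_pc(frame):
    """Mimic the low-byte increment, branch, high-byte increment sequence.

    frame is a mutable list of bytes indexed the way H:X indexes the
    stack after TSX. Assumption: offset 3 = PC high, offset 4 = PC low.
    """
    frame[4] = (frame[4] + 1) & 0xFF      # increment the low byte
    if frame[4] == 0:                     # low byte wrapped, so carry
        frame[3] = (frame[3] + 1) & 0xFF  # increment the high byte
    return (frame[3] << 8) | frame[4]


# Hypothetical frame: CCR, A, X, PCH, PCL
frame = [0x68, 0x00, 0x00, 0x80, 0xFF]
new_pc = increment_stacked_pc(frame)
print(hex(new_pc))
```

The high byte is only touched when the low byte wraps to zero, which is why the branch skips the second increment in the common case.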
Do you use the write (0x49) or the iwrite (0x19) mon08 command when the CR/LF problem occurs? Can you try to capture a trace with external tools?
Can you already start your code successfully with the run command (0x28) after reading and modifying the stack via 0x0c, or do you start your code by resetting the chip?
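For comparing against a trace, it can help to write down the exact bytes the host is supposed to send. The following Python sketch builds the command frames for the opcodes mentioned above. The operand layout (address high byte, address low byte, then data for write; a single data byte for iwrite) is an assumption to verify against the monitor chapter of the data sheet, and the target address 0x0080 is hypothetical.

```python
# Opcodes as quoted in this thread.
WRITE, IWRITE, READ_SP, RUN = 0x49, 0x19, 0x0C, 0x28


def write_cmd(addr, value):
    """WRITE frame; assumed layout: opcode, address high, address low, data."""
    return bytes([WRITE, (addr >> 8) & 0xFF, addr & 0xFF, value & 0xFF])


def iwrite_cmd(value):
    """IWRITE frame; assumed layout: opcode followed by one data byte."""
    return bytes([IWRITE, value & 0xFF])


def run_cmd():
    """RUN frame: the opcode alone."""
    return bytes([RUN])


# Writing CR then LF: one addressed write, then one indexed write.
frames = [write_cmd(0x0080, 0x0D), iwrite_cmd(0x0A), run_cmd()]
for f in frames:
    print(f.hex(" "))
```

Each of these bytes should appear twice on the line in a trace, so a missing or altered 0x0D or 0x0A is easy to spot.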
There are always two echoes: first the hardware echo, then the mon08 echo. If the baud rate is wrong, even the first echo can be corrupted if the HC08 answers too early on the half-duplex line. I had a similar problem with the 908AZ60 because I had accidentally chosen 2 stop bits. The HC08 answers pretty fast, and the start bit of its echo ran into the second stop bit, which led to the echo being misinterpreted. That is what I found out with an external trace, which should help you determine the reason for your trouble.
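The stop-bit collision can be checked with simple bit-time arithmetic. This Python sketch is a simplified model, assuming the target treats the character as 8N1 and starts its echo a given number of bit times after the first stop bit ends. The reply delay values are hypothetical; measure the real delay in your trace.

```python
def frame_bits(data_bits=8, stop_bits=1, parity=False):
    """Bits on the wire per character: start + data + optional parity + stop."""
    return 1 + data_bits + (1 if parity else 0) + stop_bits


def reply_collides(host_stop_bits, reply_delay_bits):
    """True if the target's start bit begins before the host's frame ends.

    Assumption: the target sees the frame as 8N1 and starts its echo
    reply_delay_bits bit times after the end of the first stop bit.
    """
    host_end = frame_bits(stop_bits=host_stop_bits)
    reply_start = frame_bits(stop_bits=1) + reply_delay_bits
    return reply_start < host_end


# Hypothetical reply delay of half a bit time.
print(reply_collides(host_stop_bits=2, reply_delay_bits=0.5))
print(reply_collides(host_stop_bits=1, reply_delay_bits=0.5))
```

With one stop bit on the host side there is no second stop bit for the reply to run into, which is why switching back to 8N1 removes the problem in this model.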
For tracing I can recommend the listen32 demo. Connect it to the incoming data line and you will see both directions.