TapamN wrote:
Ian Micheal wrote: Do you have a very simple example of this that I can learn from? I need this dumbed down for me (basically, replace the top byte of the pointer with 0x11, set some of the DMA registers, then SQ there). Or maybe point me to where it is in this emulator; I can't seem to find that part.
Here's a texture memory memset function. It only works on 32-byte blocks.
Code: Select all
void pvr_tex_lmemset32(pvr_ptr_t dst, int l, size_t len) {
    //Work in 32-byte blocks; any remainder below 32 bytes is not written
    len /= 32;

    //Set PVR DMA registers
    volatile int *pvrdmacfg = (volatile int *)0xA05F6888;
    pvrdmacfg[1] = pvrdmacfg[0] = 0;

    //Set QACR registers
    volatile int *qacr = (volatile int *)0xFF000038;
    qacr[1] = qacr[0] = 0x11;

    //Get SQ area address for the texture location
    volatile int *sq = (volatile int *)(0xe1000000 | ((uintptr_t)dst & 0xffffff));

    //Initialize store queues (both 32-byte queues, 16 words total)
    sq[0] = l; sq[1] = l; sq[2] = l; sq[3] = l;
    sq[4] = l; sq[5] = l; sq[6] = l; sq[7] = l;
    sq[8] = l; sq[9] = l; sq[10] = l; sq[11] = l;
    sq[12] = l; sq[13] = l; sq[14] = l; sq[15] = l;

    //Write to texture: each pref flushes one queue, then advance 32 bytes
    while(len--) {
        __asm__ __volatile__("pref @%0" : : "r" (sq) : "memory");
        sq += 8;
    }
}
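To make the "replace the top byte of the pointer" step concrete, here is a small host-side sketch of how a store-queue address and a QACR value appear to combine into the address that actually gets written. It assumes the usual SH-4 layout (QACR bits 4:2 supply address bits 28:26, the SQ-area address supplies bits 25:5, and bit 5 selects the queue). The helper names are made up for illustration and are not part of KOS.

```c
#include <assert.h>
#include <stdint.h>

/* Address a store-queue flush targets, given the 0xE0000000-area address
   passed to "pref" and the QACR value for that queue.
   Assumed layout: bits 28:26 from QACR bits 4:2, bits 25:5 from the SQ address. */
static inline uint32_t sq_target(uint32_t sq_addr, uint32_t qacr) {
    return ((qacr & 0x1cu) << 24) | (sq_addr & 0x03ffffe0u);
}

/* Which queue an SQ-area address selects: bit 5 picks SQ0 or SQ1. */
static inline unsigned sq_index(uint32_t sq_addr) {
    return (sq_addr >> 5) & 1u;
}

/* SQ-area address as built by pvr_tex_lmemset32 for a texture offset. */
static inline uint32_t sq_addr_for_tex(uint32_t tex_offset) {
    return 0xe1000000u | (tex_offset & 0x00ffffffu);
}

/* Worked example: texture offset 0x40 with QACR = 0x11. */
static inline void sq_map_selfcheck(void) {
    uint32_t a = sq_addr_for_tex(0x00000040u);
    assert(a == 0xe1000040u);
    assert(sq_target(a, 0x11u) == 0x11000040u); /* top byte becomes 0x11 */
    assert(sq_index(a) == 0u);                  /* first block uses SQ0 */
    assert(sq_index(a + 32u) == 1u);            /* next block uses SQ1 */
}
```

Under those assumptions, stepping the pointer by 32 bytes alternates between the two queues, which is why the function fills all 16 words once and then only issues pref in the loop.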
Some comments on the code you posted:
There's no need to wait for SQs to finish. An SQ write always starts immediately: if the CPU can't start it right away because another memory transfer is already in progress, the CPU stalls until it can. While the SQ write is in flight, the memory bus is unavailable, so any access you attempt stalls the CPU until the transfer completes. As a result, there is never a point where you can access RAM or an external hardware register before the SQ write completes.
Also, there's no performance advantage to using OCBI. As long as you don't write to a cache line, it can be unloaded for free automatically when something else needs to be loaded in. OCBI is really only for reading from memory that has been changed by an outside device (e.g. DMA wrote something to RAM, and you want to read the new data and not old cached data).
Using SQs to write textures would really only be noticeably faster than DMA if you can do something else while the SQ is being sent. For example, if you're decoding video, using SQs while converting YUV to RGB makes sense (convert some YUV data from cache, write a bit to texture memory with SQs, repeat), but doing all the YUV conversion at once and then doing all the texture writes with SQs at once is unlikely to be faster than DMA.
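The interleaving described above can be sketched as a host-side simulation. This is not Dreamcast code: the 32-byte staging array stands in for one store queue, the memcpy stands in for the pref that flushes it, and convert_word is a hypothetical placeholder for the real YUV to RGB math.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SQ_WORDS 8 /* one store queue holds 32 bytes = 8 x 32-bit words */

/* Placeholder for the real per-pixel work (e.g. YUV to RGB). Hypothetical. */
static inline uint32_t convert_word(uint32_t in) {
    return in ^ 0xffffffffu;
}

/* Convert one 32-byte block, flush it, then move on to the next block.
   On hardware the staging block would be the SQ and the memcpy a pref,
   so the next block's conversion could overlap with the previous flush. */
static inline void convert_and_stream(uint32_t *dst, const uint32_t *src,
                                      size_t blocks) {
    uint32_t sq[SQ_WORDS];
    for (size_t b = 0; b < blocks; b++) {
        for (int i = 0; i < SQ_WORDS; i++)
            sq[i] = convert_word(src[b * SQ_WORDS + i]);
        memcpy(dst + b * SQ_WORDS, sq, sizeof sq);
    }
}

/* Two blocks in, two converted blocks out. */
static inline void stream_selfcheck(void) {
    uint32_t src[16], dst[16];
    for (uint32_t i = 0; i < 16; i++)
        src[i] = i * 0x01010101u;
    memset(dst, 0, sizeof dst);
    convert_and_stream(dst, src, 2);
    for (int i = 0; i < 16; i++)
        assert(dst[i] == (src[i] ^ 0xffffffffu));
}
```

The point of the shape is that conversion and output happen in the same loop, in 32-byte steps, rather than as two separate passes over the frame.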
Actually, now that I think about it, SQs might be faster for small transfers. From what I've noticed, DMA seems to have a bit of a startup delay, so SQs might win for certain transfer sizes (like less than one or two kilobytes) even if you can't find something else to do in the meantime. This needs to be benchmarked to know for certain, and to find out for which sizes.
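A benchmark for this could be structured along the following lines. This is only a sketch that runs on a host PC: clock() stands in for whatever hardware timer you would use on the Dreamcast, and copy_with_memcpy is a hypothetical stand-in for the SQ and DMA routines you would actually compare.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Common signature for the transfer methods under test. */
typedef void (*copy_fn)(void *dst, const void *src, size_t n);

/* Stand-in for one of the methods being compared. Hypothetical. */
static inline void copy_with_memcpy(void *dst, const void *src, size_t n) {
    memcpy(dst, src, n);
}

/* Average seconds per call over `reps` repetitions (reps must be > 0). */
static inline double time_copy(copy_fn fn, void *dst, const void *src,
                               size_t n, int reps) {
    clock_t start = clock();
    for (int i = 0; i < reps; i++)
        fn(dst, src, n);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC / reps;
}

/* Sweep transfer sizes from 32 bytes to 2 KB, doubling each time. */
static inline void bench_selfcheck(void) {
    static unsigned char src[2048], dst[2048];
    memset(src, 0x5a, sizeof src);
    for (size_t n = 32; n <= sizeof src; n *= 2) {
        double t = time_copy(copy_with_memcpy, dst, src, n, 1000);
        assert(t >= 0.0);
        assert(memcmp(dst, src, n) == 0);
    }
}
```

Running each method through time_copy at every size and comparing the results would show where, if anywhere, the crossover between SQ and DMA sits.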
Code: Select all
/* copies n bytes from src to dest, dest must be 32-byte aligned */
void * sq_cpy2(void *dest, const void *src, int n) {
    unsigned int *d = (unsigned int *)(void *)
                      (0xe0000000 | (((unsigned long)dest) & 0x03ffffe0));
    const unsigned int *s = src;

    /* Set store queue memory area as desired */
    QACR0 = ((((unsigned int)dest) >> 26) << 2) & 0x1c;
    QACR1 = ((((unsigned int)dest) >> 26) << 2) & 0x1c;

    /* fill/write queues as many times necessary */
    n >>= 5;

    while(n--) {
        __asm__("pref @%0" : : "r"(s + 8));  /* prefetch 32 bytes for next loop */
        d[0] = *(s++);
        d[1] = *(s++);
        d[2] = *(s++);
        d[3] = *(s++);
        d[4] = *(s++);
        d[5] = *(s++);
        d[6] = *(s++);
        d[7] = *(s++);
        __asm__("pref @%0" : : "r"(d));
        d += 8;
    }

    /* Wait for both store queues to complete */
    d = (unsigned int *)0xe0000000;
    d[0] = d[8] = 0;

    return dest;
}
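Because the copy works in whole 32-byte blocks (n >>= 5), any tail shorter than 32 bytes is silently dropped, and the destination address is masked down to a 32-byte boundary. A small sketch of the checks a caller might run first is below. The helper names are invented for illustration, and the 4-byte source alignment check is my assumption, based on the source being read as unsigned int.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of full 32-byte blocks the copy loop will transfer. */
static inline size_t sq_whole_blocks(size_t n) {
    return n >> 5;
}

/* Bytes at the end that the copy loop will NOT transfer. */
static inline size_t sq_tail_bytes(size_t n) {
    return n & 31u;
}

/* Nonzero if the arguments suit a block copy like the one above:
   dest 32-byte aligned, src 4-byte aligned (assumption), n a multiple of 32. */
static inline int sq_args_ok(uintptr_t dest, uintptr_t src, size_t n) {
    return (dest & 31u) == 0 && (src & 3u) == 0 && (n & 31u) == 0;
}

/* Worked example: 100 bytes is 3 full blocks plus a 4-byte tail. */
static inline void sq_args_selfcheck(void) {
    assert(sq_whole_blocks(100) == 3);
    assert(sq_tail_bytes(100) == 4);
    assert(sq_args_ok(0x10000000u, 0x8c010000u, 96));
    assert(!sq_args_ok(0x10000010u, 0x8c010000u, 96));
    assert(!sq_args_ok(0x10000000u, 0x8c010000u, 100));
}
```

If the tail is nonzero, the caller has to pad the buffer or handle the remaining bytes some other way.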
Code: Select all
/* copies n bytes from src to dest, dest must be 32-byte aligned */
void * sq_cpy(void *dest, const void *src, int n) {
    unsigned int *d = (unsigned int *)(void *)
                      (0xe0000000 | (((unsigned long)dest) & 0x03ffffe0));
    const unsigned int *s = src;

    /* Set store queue memory area as desired */
    QACR0 = ((((unsigned int)dest) >> 26) << 2) & 0x1c;
    QACR1 = ((((unsigned int)dest) >> 26) << 2) & 0x1c;

    /* fill/write queues as many times necessary */
    n >>= 5;

    while(n--) {
        __asm__("pref @%0" : : "r"(s + 8));  /* prefetch 32 bytes for next loop */
        d[0] = *(s++);
        d[1] = *(s++);
        d[2] = *(s++);
        d[3] = *(s++);
        d[4] = *(s++);
        d[5] = *(s++);
        d[6] = *(s++);
        d[7] = *(s++);
        __asm__("pref @%0" : : "r"(d));
        d += 8;
    }

    /* Wait for both store queues to complete */
    d = (unsigned int *)0xe0000000;
    d[0] = d[8] = 0;

    return dest;
}
Code: Select all
//Set QACR registers
volatile int *qacr = (int*)0xFF000038;
qacr[1] = qacr[0] = 0x11;
Code: Select all
/* Set store queue memory area as desired */
QACR0 = ((((unsigned int)sbuf->vramData)>>26)<<2)&0x1c;
QACR1 = ((((unsigned int)sbuf->vramData)>>26)<<2)&0x1c;
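The two ways of setting QACR shown above can be compared with plain arithmetic. The sketch below assumes that only bits 4:2 of QACR are significant (they supply address bits 28:26). qacr_for is a hypothetical helper that wraps the shift-and-mask expression from the second snippet.

```c
#include <assert.h>
#include <stdint.h>

/* QACR value for a destination address: bits 28:26 moved into bits 4:2. */
static inline uint32_t qacr_for(uint32_t dest) {
    return ((dest >> 26) << 2) & 0x1cu;
}

/* Only bits 4:2 of a QACR value are assumed to matter. */
static inline uint32_t qacr_area_bits(uint32_t qacr) {
    return qacr & 0x1cu;
}

static inline void qacr_selfcheck(void) {
    /* 0x10000000 and 0x11000000 lie in the same 64 MB area. */
    assert(qacr_for(0x10000000u) == 0x10u);
    assert(qacr_for(0x11000000u) == 0x10u);
    /* The hard-coded 0x11 selects that same area; bit 0 is outside the field. */
    assert(qacr_area_bits(0x11u) == qacr_for(0x11000000u));
    /* Other areas for comparison. */
    assert(qacr_for(0x0c000000u) == 0x0cu);
    assert(qacr_for(0xa5000000u) == 0x04u);
}
```

If that assumption holds, the hard-coded 0x11 and the computed value pick the same area, and the difference between 0x10000000 and 0x11000000 is carried by bit 24 of the store-queue address instead (0xe1000000 in the memset function).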
Code: Select all
#define NONCACHED(a) (typeof (&(a)[0]))(((unsigned int)(a)) | (1 << 29))
#define CACHED(a) (typeof (&(a)[0]))(((unsigned int)(a)) & ~(1 << 29))
#define OCI_BANK0(a) (typeof (&(a)[0]))(((unsigned int)(a)) & ~(1 << 25))
#define OCI_BANK1(a) (typeof (&(a)[0]))(((unsigned int)(a)) | (1 << 25))
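As a quick illustration of what these macros do to an address, here is the same bit manipulation written as plain functions on 32-bit values so it can be checked on a host. The function names are mine, and the example addresses are assumptions chosen to look like typical Dreamcast main RAM addresses.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 29 set: the same location seen through the uncached mirror. */
static inline uint32_t addr_noncached(uint32_t a) { return a | (1u << 29); }

/* Bit 29 clear: back to the cached mirror. */
static inline uint32_t addr_cached(uint32_t a) { return a & ~(1u << 29); }

/* Bit 25 clear or set: selects bank 0 or bank 1, as in the OCI_BANK macros. */
static inline uint32_t addr_oci_bank0(uint32_t a) { return a & ~(1u << 25); }
static inline uint32_t addr_oci_bank1(uint32_t a) { return a | (1u << 25); }

static inline void addr_selfcheck(void) {
    assert(addr_noncached(0x8c010000u) == 0xac010000u);
    assert(addr_cached(0xac010000u) == 0x8c010000u);
    assert(addr_oci_bank1(0x8c010000u) == 0x8e010000u);
    assert(addr_oci_bank0(0x8e010000u) == 0x8c010000u);
}
```

The macros in the thread do the same thing but keep the pointer type of their argument via typeof.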
Code: Select all
/** \brief Store Queue 0 access register */
#define QACR0 (*(volatile unsigned int *)(void *)0xff000038)
/** \brief Store Queue 1 access register */
#define QACR1 (*(volatile unsigned int *)(void *)0xff00003c)
Code: Select all
void StreamRender_DisplayFrame( StreamBuffer * sbuf )
{
    while(!sbuf->frames)
        thd_pass();

    //printf("RenderFrame: Buffer Contains %i frames\n", sbuf->frames );

    while(sbuf->locked)
        thd_pass();

    sbuf->locked = 1;
    StreamTexturePVR( sbuf );
    sbuf->locked = 0;

    pvr_wait_ready();
    pvr_scene_begin();
    pvr_list_begin( PVR_LIST_OP_POLY );
    sq_cpy( (void*)0x10000000, VERTEX_BUFFER, VERTEX_COUNT * 32 );
    pvr_list_finish();
    pvr_scene_finish();
}
KmusDC wrote:Hello bro, excellent contribution, good to see updates for this magnificent emulator. I would like to know how to convert it to an iso for dreamshell. Greetings
WedgeStratos wrote: I am currently going through a full TOSEC set to pick out only the games that work with this emulator. A number of issues exist, particularly with raster graphics, so there are NO racing games that work, and it causes graphical issues in a number of games like Ecco or Comix Zone. With that said:
Gens4SSP: The Sega Smash Pack Stand-In. Download here.
This uses a rebuilt ELF provided by Ian Micheal (thanks m8) and contains 22 games collected from the 5 separate Sega Smash Pack releases across PC, Dreamcast and GBA, plus stand-ins for the games that don't work.
This has only been tested via Demul and Redream in emulation, and via GDemu on retail hardware, so I cannot be held liable for excess coasters.
Again, I am testing the TOSEC for valid games. It will be a few days, but I have already completed everything from A-E in the games offered in the US NTSC catalog. A game has to be playable with no major graphical defects that hinder the experience, as well as not having sound issues.
Released further in this thread, or linked here