Image Positioner V1
Directed Image Positioning v1

#concept
#FLUX

Tell the model where you want images placed in your scene using natural-language prompts. Experimental.

...

After some experimentation, it does seem to improve image coherence and positional adherence. Pics below, with notes in the images.

I have added a public dataset of PNG images with JSON metadata, which I created with Python. The images are not captioned, but the dataset should be better than anything I found online, if anyone cares to join me in my experiments.
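For anyone who wants to build a similar set, here is a minimal sketch of generating a PNG plus a JSON metadata record with only the Python standard library. It is not the script used for the published dataset: the function names (`png_bytes`, `make_sample`, `save_sample`) and the JSON field names are assumptions made up for illustration.

```python
import json
import struct
import zlib
from pathlib import Path


def png_bytes(pixels, width, height):
    """Encode rows of (r, g, b) tuples as an 8-bit RGB PNG (stdlib only)."""
    def chunk(tag, data):
        body = tag + data
        crc = zlib.crc32(body) & 0xFFFFFFFF  # CRC covers chunk type + data
        return struct.pack(">I", len(data)) + body + struct.pack(">I", crc)

    # Each scanline is prefixed with filter type 0 (no filtering).
    raw = b"".join(
        b"\x00" + bytes(channel for px in row for channel in px)
        for row in pixels
    )
    # IHDR: width, height, bit depth 8, colour type 2 (RGB), no interlace.
    header = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    return (
        b"\x89PNG\r\n\x1a\n"
        + chunk(b"IHDR", header)
        + chunk(b"IDAT", zlib.compress(raw))
        + chunk(b"IEND", b"")
    )


def make_sample(name, width, height, box, color, background=(255, 255, 255)):
    """Draw one filled rectangle and return (png_data, metadata).

    box is (x0, y0, x1, y1) in pixels, with x1 and y1 exclusive.
    The metadata layout is a hypothetical schema, not the dataset's.
    """
    x0, y0, x1, y1 = box
    pixels = [
        [color if x0 <= x < x1 and y0 <= y < y1 else background
         for x in range(width)]
        for y in range(height)
    ]
    metadata = {
        "image": f"{name}.png",
        "width": width,
        "height": height,
        "shapes": [
            {"type": "rectangle", "color": list(color), "bbox": [x0, y0, x1, y1]}
        ],
    }
    return png_bytes(pixels, width, height), metadata


def save_sample(directory, name, png_data, metadata):
    """Write name.png and name.json side by side in the given directory."""
    out = Path(directory)
    out.mkdir(parents=True, exist_ok=True)
    (out / f"{name}.png").write_bytes(png_data)
    (out / f"{name}.json").write_text(json.dumps(metadata, indent=2))
```

Calling `make_sample("sample_0001", 64, 64, (4, 4, 20, 20), (255, 0, 0))` and passing the result to `save_sample` would produce one image and its matching JSON file. Keeping exact pixel coordinates in the metadata means captions in any wording can be derived later.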


I used GeoShapes images that were hand-labeled: only the smaller shapes were tagged, each with its approximate location described in varied plain language rather than in any formal scheme. I then let an AI add descriptive captions after those labels. If I had a more formal dataset of positional images, I could do magic.

From ChatGPT:

That’s a clever approach! Using GeoShapes with plain language labeling for training is an innovative start, as it provides the AI a loose "positional vocabulary" to work with. This flexibility allows it to generalize placement and context, which is why you’re seeing those natural, somewhat structured compositions.

To elevate this even further, a more formal positional dataset would be incredibly powerful. Here are a few ideas for constructing such a dataset to amplify the precision and adaptability of your model:

1. Gridded Coordinate System: Create images with shapes positioned according to a defined grid (e.g., 3x3 or 5x5). Each grid cell would represent a different placement: top-left, center-right, bottom-middle, etc. Labels would include grid-based coordinates (e.g., (1,1) for top-left), allowing for a finer positional mapping. This structured approach could help the AI learn precise placements.

2. Layered Position Descriptions: Instead of just approximate descriptions, use layered terms. For instance, combine location (e.g., top-left, bottom-center), size (small, medium, large), distance from edges, and angles (rotation, tilt). This would give the model richer metadata, allowing for more specific placements, rotations, and even layering of shapes.

3. Structured Language for Composition Rules: By integrating a formalized language structure that includes rules for composition (e.g., “centered vertically, offset slightly left horizontally”), the model could learn to arrange shapes based on relational positions—how one object should sit relative to others or the background.

4. Dynamic Background and Lighting Contexts: Incorporate variations in backgrounds and lighting to teach the model how positional accuracy changes in context. For example, adding shadows based on hypothetical light sources could make positional interpretation even more robust.

5. Use of Bounding Boxes or Anchor Points: Label images with bounding boxes or anchor points for shapes, with coordinates that specify exact placements. These markers would help the AI learn exact placements rather than just "approximate" positioning, useful for precise layouts.

6. Combined Shapes in Hierarchical Order: Introducing overlapping or adjacent shapes with a hierarchy of importance (e.g., primary, secondary) would let the model understand layered compositions. For example, a small red square within a larger green square can be tagged as "centered within green square," helping the AI learn containment and layering nuances.

Once you have this kind of dataset, you could train the AI to handle much more complex layouts, orientations, and compositions. With such a structured dataset, the potential for precise layout and arrangement—whether for graphic design, illustration, or even UI composition—would be almost endless. Let me know if you’d like to discuss any of these ideas further or explore ways to implement them!
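The gridded coordinate and bounding box ideas in the list above (items 1 and 5) can be combined in a few lines. The sketch below maps a bounding box to a 3x3 grid cell and a plain-language position name. It assumes 1-indexed (row, column) coordinates so that (1, 1) is top-left, and the function names `grid_label` and `caption` are hypothetical, not part of any existing tool.

```python
ROW_NAMES = ("top", "center", "bottom")
COL_NAMES = ("left", "middle", "right")


def grid_label(bbox, width, height):
    """Map a bounding box to a 3x3 grid cell using the box centre.

    bbox is (x0, y0, x1, y1) in pixels. Returns ((row, col), name) with
    1-indexed (row, col), so (1, 1) is the top-left cell.
    """
    x0, y0, x1, y1 = bbox
    cx = (x0 + x1) / 2
    cy = (y0 + y1) / 2
    # Clamp to the last cell so a centre on the far edge stays in range.
    col = min(int(cx * 3 / width), 2)
    row = min(int(cy * 3 / height), 2)
    if (row, col) == (1, 1):
        name = "center"
    else:
        name = f"{ROW_NAMES[row]}-{COL_NAMES[col]}"
    return (row + 1, col + 1), name


def caption(shape, bbox, width, height):
    """Build a simple positional caption from a shape description."""
    _, name = grid_label(bbox, width, height)
    return f"{shape} at the {name} of the image"


print(grid_label((4, 4, 20, 20), 64, 64))
print(caption("a small red square", (44, 24, 60, 40), 64, 64))
```

Using the box centre is only one possible rule. A shape that spans several cells might need a different treatment, such as labeling by the cell with the largest overlap.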

Announcements
2024-11-14: Model published
2024-11-14: Model information updated

Model details
Type: LORA
Published: 2024-11-14
Base model: Flux.1 D

Version notes

Re-uploaded to change the name. Don't be mad if it isn't perfect or doesn't work very well lol, I don't know yet.

License scope
Model source: civitai

1. Rights to reshared models belong to their original creators.

2. Original model creators should contact SeaArt AI staff through official channels to claim their models.

Creation license scope
Online image generation
Merging
Allow download
Commercial license
Generated images may be used commercially or sold
Reselling the model, or selling it after merging, is permitted